1.  INTRODUCTION

In many cases it is desirable for a compiler of a programming language
to produce output code that is economical with respect to some cost
criterion, such as program size or program speed. A program is usually
optimized by applying transformations to some intermediate-language
representation of it. Many transformations can be applied, and compilers
that use transformations have been built (7,2). Ideally, a theory of code
simplification should provide a machine-independent mechanism for
reducing a given program to an equivalent program which is in a simplest
possible form (in some sense).

The existence of algorithms for simplification is closely connected
with the question of the solvability of the equivalence of programs.
If the equivalence problem were solvable, then an algorithm for reducing
a program to a simplest form would exist in principle. It can be shown
that for almost any reasonable notion of equivalence between computer
programs, the question of equivalence of pairs of programs is not
partially decidable (8). There is no effective procedure for determining
whether or not two programs are equivalent. Therefore, in general, we
cannot find a finite collection of equivalence-preserving transformations
such that any pair of equivalent programs can be transformed one into the
other by applying a finite sequence of these transformations.

Despite the undecidability of the theory in general, positive
theoretical results can be obtained in many cases, and efforts have been
made to isolate subclasses of programs for which the equivalence problem
is decidable (8, 9), and to find complete sets of equivalence-preserving
transformations which can be applied to decidable subclasses of programs
(1).

Aho and Ullman (1) considered a type of program schema that models
straight-line code. For this case they found a complete set of
equivalence-preserving transformations and showed that these
transformations can be applied to get an optimal code. They extended
their results to cases in which certain types of algebraic laws are
assumed.

The purpose of this thesis is to consider a program schema that models
loop-free programs and to extend Aho and Ullman's results to this case.
For this subclass of programs the equivalence problem is decidable (9).

In Chapter 2 the model of loop-free program schema is presented.
A program schema will represent a family of computer programs in the
sense that if the operator and test names are given a particular
interpretation the schema becomes a program that can be executed by an
idealized computer.

In Chapter 3 the notion of equivalence of loop-free programs is
defined and in Chapter 4 a set of equivalence-preserving transformations
is presented. This set extends Aho and Ullman's set of transformations.
Their representation of programs by directed acyclic graphs is also used.
The set of transformations presented is shown to be complete, i.e. two
programs are equivalent if and only if they can be transformed one into
the other by a sequence of these transformations.

In  Chapter  5  a  schema  for  optimization  is  provided  in  which  a  sequence 
of  the  transformations  is  applied  to  get  an  optimal  code. 


Chapter  6  extends  the  results  of  the  previous  chapters  to  the  model 
of  program  schemata  that  assumes  that  a  set  of  algebraic  laws  holds  among 
the  operators.   It  is  shown  that  in  this  case  two  programs  are  equivalent 
if  and  only  if  they  can  be  transformed  one  into  the  other  by  a  set  of 
topological  and  algebraic  transformations.   It  is  also  shown  that  certain 
types  of  local  optimization  techniques  can  be  considered  as  algebraic 
identities  that  hold  among  operators,  operands  and  constants. 

In  Chapters  7  and  8  the  results  are  extended  to  the  model  of  program 
schemata  in  which  the  tests  are  Boolean  functions  of  elementary  tests. 
It  is  proved  that  in  this  case  two  programs  are  equivalent  if  and  only  if 
they  can  be  transformed  one  into  the  other  by  a  set  of  topological  and 
logical  transformations.   Chapter  8  provides  a  scheme  for  optimization 
that  uses  topological  and  logical  transformations. 

In Chapter 9 the subclass of programs which always halt is considered.
For this subclass the equivalence problem is decidable (8, 9) although
membership in this class is not (8). It is shown that the procedure for
optimizing loop-free programs might be used to optimize programs which
always halt. Certain transformations that are known to improve the code
of programs with loops are shown to be equivalent to sequences of
transformations on loop-free programs.


2.   THE PROGRAM SCHEMA

Let  $\Sigma$ be a countable alphabet of variable names,
     $\Theta$ a countable set of operator names,
     $T$ a countable set of test names.

Statements are of three types:

(i)   assignment statements

         $k \quad A \leftarrow \theta B_1 \ldots B_r$

         $A, B_1, \ldots, B_r \in \Sigma$

$k$ is a numeral which is optional, and is the address of the statement.
$\theta$ is an $r$-ary operator name, $\theta \in \Theta$.
The variable $A$ is assigned a new value, which depends on the current
values of $B_1, \ldots, B_r$ and on the unspecified operator $\theta$.
We say that $A$ is defined by this statement and $B_1, \ldots, B_r$ are
referenced by this statement.

(ii)  test statements

         $k \quad t(C_1, \ldots, C_r) \quad k_1, k_2$

         $t \in T$, $C_1, \ldots, C_r \in \Sigma$

$k, k_1, k_2$ are numerals, $k$ is optional, $k_1, k_2$ may be equal.
Control goes to the statement with the prefix $k_1$ if the predicate
$t(C_1, \ldots, C_r)$ is true, otherwise to the statement with the
prefix $k_2$. $k_1$ and $k_2$ are called transfer addresses. We say that
$C_1, \ldots, C_r$ are referenced by this statement.

(iii) the statement STOP.

A program schema $\Pi$ is a triple $(P, I, U)$ where $P$ is a finite
sequence of statements and $I$, $U$ are finite sets of variables, input
and output respectively.

A loop-free program schema is a program schema that does not have loops,
i.e. a transfer address of a test statement never references a statement
which precedes the test statement in some possible sequence of statements
of $\Pi$.

We will deal with loop-free program schemata; thus throughout
this thesis a program schema should mean a loop-free program schema.

No transfer address references a numeral which is not a prefix
of some statement, and no statement has the same prefix as another
statement in the program schema.
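The loop-free condition can be checked mechanically. The following is a minimal sketch, not part of the thesis: it assumes a schema is encoded as a Python list of (prefix, statement) pairs, treats "loop-free" as "the control graph reachable from the first statement has no cycle", and assumes every run of assignments ends in a test or a STOP. The tuple encoding and the name `is_loop_free` are illustrative choices.

```python
def is_loop_free(program):
    """Return True if no control path can revisit a statement.

    `program` is a list of (prefix, statement) pairs, where a statement is
    ("assign", target, operator, args), ("test", name, args, k1, k2)
    or ("stop",). This encoding is an assumption of the sketch.
    """
    index_of = {prefix: i for i, (prefix, _) in enumerate(program)
                if prefix is not None}

    def successors(i):
        stmt = program[i][1]
        if stmt[0] == "stop":
            return []
        if stmt[0] == "assign":
            return [i + 1]                       # fall through
        return [index_of[stmt[3]], index_of[stmt[4]]]  # transfer addresses

    state = {}  # index -> "active" while on the DFS stack, "done" afterwards

    def visit(i):
        if state.get(i) == "active":
            return False                         # back edge: a loop
        if state.get(i) == "done":
            return True
        state[i] = "active"
        ok = all(visit(j) for j in successors(i))
        state[i] = "done"
        return ok

    return visit(0)


branching = [
    (None, ("test", "t", ("X",), 1, 2)),
    (1, ("assign", "Z", "f", ("X",))),
    (None, ("stop",)),
    (2, ("assign", "Z", "g", ("X",))),
    (None, ("stop",)),
]
looping = [
    (1, ("assign", "X", "f", ("X",))),
    (None, ("test", "t", ("X",), 1, 2)),
    (2, ("stop",)),
]
print(is_loop_free(branching), is_loop_free(looping))  # prints: True False
```

A depth-first search is used because a cycle in a directed graph shows up exactly as an edge back to a statement that is still on the search stack.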
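EXAMPLE 1

$\Pi = (P, \{X, Y\}, \{Z\})$

P:      $L_1 \leftarrow \varphi XY$
        $t(L_1)$   3,5
   3    $Z \leftarrow \Psi L_1$
        STOP
   5    $L_2 \leftarrow \theta L_1$
        $Z \leftarrow \Psi L_2$
        STOP

$\varphi, \Psi, \theta \in \Theta$;  $X, Y, Z, L_1, L_2 \in \Sigma$;  $t \in T$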

A  program  schema  represents  a  family  of  computer  programs. 
To  provide  an  interpretation  for  a  program  schema  we  choose  some  finite 
or  infinite  set  of  values  (domain),  then  make  an  assignment  of  values 
from  the  set  to  each  input  variable  and  assignments  of  appropriate 
functions  and  predicates  on  the  set  to  the  operator  and  test  names  of 
the  program  schema.   Given  such  an  interpretation  the  schema  becomes  a 
program  which  can  be  executed  by  a  computer. 


An interpretation for the program schema $\Pi$ of Example 1 could
be as follows:

The domain set is the set of real numbers.

The test name $t(x)$ is interpreted as the predicate $In(t)(x)$
which is true if $x > 0$ and is false otherwise. The interpretation of
the operator names $\varphi, \Psi, \theta$ is as follows:

   $In(\varphi XY) = +XY$
   $In(\Psi X) = \mathrm{SQRT}\ X$
   $In(\theta X) = -X$

The interpreted program is

In(P):  $L_1 \leftarrow +XY$
        $>0(L_1)$   3,5
   3    $Z \leftarrow \mathrm{SQRT}\ L_1$
        STOP
   5    $L_2 \leftarrow -L_1$
        $Z \leftarrow \mathrm{SQRT}\ L_2$
        STOP

Under this interpretation the program schema computes the square
root of the absolute value of $X+Y$.
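To make the step from schema to actual program concrete, here is a small sketch of an interpreter for the schema of Example 1 under the interpretation just described. The tuple encoding of statements and the names `EXAMPLE_1`, `run`, `operators` and `tests` are assumptions made for illustration; they do not come from the thesis.

```python
import math

# Example 1 as (prefix, statement) pairs; prefix None means "no numeral".
EXAMPLE_1 = [
    (None, ("assign", "L1", "phi", ("X", "Y"))),
    (None, ("test", "t", ("L1",), 3, 5)),
    (3,    ("assign", "Z", "psi", ("L1",))),
    (None, ("stop",)),
    (5,    ("assign", "L2", "theta", ("L1",))),
    (None, ("assign", "Z", "psi", ("L2",))),
    (None, ("stop",)),
]


def run(program, outputs, inputs, operators, tests):
    """Execute a loop-free schema under an interpretation.

    `inputs` assigns domain values to input variables, `operators` and
    `tests` assign functions and predicates to the operator and test names.
    Returns the vector of output values.
    """
    index_of = {prefix: i for i, (prefix, _) in enumerate(program)
                if prefix is not None}
    env = dict(inputs)
    pc = 0
    while True:
        stmt = program[pc][1]
        if stmt[0] == "stop":
            return tuple(env[v] for v in outputs)
        if stmt[0] == "assign":
            _, target, op, args = stmt
            env[target] = operators[op](*(env[a] for a in args))
            pc += 1
        else:
            _, name, args, k1, k2 = stmt
            taken = tests[name](*(env[a] for a in args))
            pc = index_of[k1 if taken else k2]


# The interpretation from the text: reals, t(x) = (x > 0), +, SQRT, negation.
operators = {"phi": lambda x, y: x + y, "psi": math.sqrt, "theta": lambda x: -x}
tests = {"t": lambda x: x > 0}

print(run(EXAMPLE_1, ("Z",), {"X": 3.0, "Y": -7.0}, operators, tests))  # (2.0,)
```

Swapping in a different `operators` or `tests` dictionary yields a different member of the family of programs that the schema represents.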

Formally, an interpretation $In$ of a program $\Pi$ is a mapping
from the set of input variables, operator names and test names of $\Pi$
into a set $D$ and the set of functions and predicates on $D$ such that
(i)   each variable $A \in I$ is assigned an element $In(A) \in D$.
(ii)  each $r$-ary operator name $\theta$ is assigned an $r$-ary function
      $In(\theta): D^r \to D$.
(iii) each test name $t$ is assigned a predicate on $D$,
      $In(t): D^r \to \{T, F\}$, where $r$ is the number of arguments of $t$.

We shall call program schemata abstract programs or programs.

An interpreted program schema is called an actual program or a computer
program.

A block is a sequence of assignment statements $S_1, \ldots, S_{n-1}$
with a STOP or a test statement $S_n$, and either

(i)   $S_1$ follows a test statement
or
(ii)  $S_1$ follows a STOP statement
or
(iii) $S_1$ is the first statement in $P$.

P,  the  sequence  of  statements  of  the  program  schema  can  be 
represented  by  a  directed  graph.   The  graph  representing  P  will  be  called 
the  graph  corresponding  to  the  program  schema  or  simply,  the  graph  of  the 
program  schema. 

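The graph corresponding to the program $\Pi$ of Example 1 is

[Figure: the root node holds the block $L_1 \leftarrow \varphi XY$; its
left descendant holds $Z \leftarrow \Psi L_1$, STOP and its right
descendant holds $L_2 \leftarrow \theta L_1$, $Z \leftarrow \Psi L_2$, STOP.]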


A directed graph corresponding to a program schema has the
following properties:

(i)   it is acyclic (i.e. it has no loops).
(ii)  it has only one root.
(iii) each node can have one or two descendants, or it is a leaf.
(iv)  each node can have several ancestors.

We will often represent a program schema $\Pi = (P, I, U)$ by
$(G, I, U)$ where $G$ is the directed graph representing $P$.

Each path of the graph of $\Pi$ corresponds to a possible sequence
of statements of the program $\Pi$.

A  path  will  be  called  executable  if  the  corresponding  sequence 
of  statements  can  be  executed  under  some  interpretation  of  the  program 
schema. 

A  path  will  be  called  nonexecutable  if  under  no  interpretation 
(i.e.  assignments  of  input  values,  functions  and  predicates),  the 
corresponding  sequence  of  statements  can  be  executed. 
EXAMPLE 2

$\Pi = (G, \{L_1, L_3\}, \{L_2\})$

[Figure: the graph $G$ of $\Pi$.]

The path corresponding to the sequence of statements

   $L_2 \leftarrow F L_3$
   $t(L_2)$
   $t(L_3)$
   $L_2 \leftarrow B L_1$
   STOP

is nonexecutable. Under no assignment of values to the input variables
$L_1$ and $L_3$, of functions to $F$ and $B$, and of a predicate to $t$
can this sequence of statements be executed.

In the following chapter we shall present a theorem which
characterizes executable and nonexecutable paths in terms of the test
statements appearing in the paths.

3.   EQUIVALENCE OF PROGRAM SCHEMATA

Two program schemata will be called equivalent iff for every
interpretation the output values of the two actual programs are equal.
Formally the notion of program equivalence will be defined in the
following way:

Let $L$ be the finite set of all executable paths of the program
$\Pi$. For each path $\ell \in L$ there is a corresponding sequence of
statements $S_1^{\ell}, \ldots, S_m^{\ell}$.

If path $\ell$ is executed under some interpretation $In$,
$val_{In}(\Pi)$ will be defined to be the vector of the output variables
after executing $S_1^{\ell}, \ldots, S_m^{\ell}$, and will be called the
value of the program schema under interpretation $In$.

DEFINITION

Two program schemata $\Pi$ and $\Pi'$ will be called equivalent
($\Pi \equiv \Pi'$) iff for all interpretations $In$,
$val_{In}(\Pi) = val_{In}(\Pi')$.

Equivalence of program schemata will be characterized in terms
of the sets of terminal expressions computed along executable paths.

We define the expression $v_k^{\ell}(A)$, the value of the variable $A$
after executing $S_k^{\ell}$, as follows:

1)  $v_0^{\ell}(A) = A$ for all $A \in I$.

2)  If $S_k^{\ell}$ is a test or a STOP statement,
    $v_k^{\ell}(A) = v_{k-1}^{\ell}(A)$.

3)  If $S_k^{\ell}$ is $A \leftarrow \theta B_1 \ldots B_r$ then
    $v_k^{\ell}(A)$ is the expression
    $\theta\, v_{k-1}^{\ell}(B_1) \ldots v_{k-1}^{\ell}(B_r)$,
    and for all $C$, $C \neq A$, $v_k^{\ell}(C) = v_{k-1}^{\ell}(C)$.

4)  $v_k^{\ell}(A)$ is undefined otherwise.

The value of the executable path $\ell$, denoted by $v^{\ell}(\Pi)$, is

   $\{ v_m^{\ell}(A),\ A \in U \}$


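EXAMPLE 3

If $\Pi$ is of Example 1, $\Pi = (G, \{X, Y\}, \{Z\})$,

[Figure: the graph $G$ of Example 1, with root $L_1 \leftarrow \varphi XY$
and two STOP leaves.]

and $\ell_1$ and $\ell_2$ are the left and right paths respectively, then

   $v^{\ell_1}(\Pi) = v_4^{\ell_1}(Z) = \Psi\varphi XY$

   $v^{\ell_2}(\Pi) = v_5^{\ell_2}(Z) = \Psi\theta\varphi XY$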
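The rules defining the value of a path translate directly into a symbolic evaluator. The sketch below is an illustration under assumptions: a path is encoded as a list of statement tuples, operator names are ASCII strings (`phi`, `psi`, `theta` stand in for the operator names of Example 1), and expressions are kept as space-separated Polish-notation strings. The function name `path_value` is hypothetical.

```python
def path_value(path, inputs, outputs):
    """Return the Polish-notation expression of each output variable
    after the statements of `path` have been executed symbolically."""
    value = {a: a for a in inputs}            # rule 1: inputs denote themselves
    for stmt in path:
        if stmt[0] == "assign":               # rule 3: operator, then arguments
            _, target, op, args = stmt
            # A KeyError here means an argument is undefined (rule 4).
            value[target] = " ".join([op] + [value[b] for b in args])
        # rule 2: tests and STOP leave every value unchanged
    return tuple(value[a] for a in outputs)


left_path = [
    ("assign", "L1", "phi", ("X", "Y")),
    ("test", "t", ("L1",)),
    ("assign", "Z", "psi", ("L1",)),
    ("stop",),
]
right_path = [
    ("assign", "L1", "phi", ("X", "Y")),
    ("test", "t", ("L1",)),
    ("assign", "L2", "theta", ("L1",)),
    ("assign", "Z", "psi", ("L2",)),
    ("stop",),
]
print(path_value(left_path, ("X", "Y"), ("Z",)))   # ('psi phi X Y',)
print(path_value(right_path, ("X", "Y"), ("Z",)))  # ('psi theta phi X Y',)
```

Strings are enough here because Polish notation with fixed operator arities is unambiguous, so two expressions are equal exactly when their strings are equal.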

A program $\Pi$ is said to be proper if for every executable path
$\ell$, whenever a variable $B$ is referenced by the $k$-th statement of
$\ell$ ($S_k^{\ell}$), $v_{k-1}^{\ell}(B)$ is defined.

We will deal with proper programs only; thus throughout this
thesis, a program schema or a program should mean a proper loop-free
program schema.
DEFINITION

Let $\Pi$ and $\Pi'$ be two programs. Let $\ell$ and $k$ be two
executable paths of $\Pi$ and $\Pi'$ respectively. $\ell$ and $k$ are
said to be consistent iff for every $t$ such that

(i)  $t$ is in the sequences of statements that correspond to both $\ell$
     and $k$, so that in path $\ell$ there is a statement $S_i^{\ell}$ of
     the form $t(C_1, \ldots, C_r)\ n_1, n_2$ and in path $k$ there is a
     statement $S_j^{k}$ of the form $t(D_1, \ldots, D_r)\ m_1, m_2$, and

(ii) $v_i^{\ell}(C_q) = v_j^{k}(D_q)$ for all $q$, $1 \le q \le r$, that
     is, the values of the variables referenced by the statements are
     identical,

the statement prefixed by $n_1$ is included in path $\ell$ iff the
statement prefixed by $m_1$ is included in path $k$, and the statement
prefixed by $n_2$ is included in path $\ell$ iff the statement prefixed
by $m_2$ is included in path $k$.
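EXAMPLE 5

$\Pi = (G_P, \{A, B, C, D\}, \{C\})$     $\Pi' = (G_{P'}, \{A, B, C, D\}, \{C\})$

[Figure: the graphs $G_P$ and $G_{P'}$. Both are built from the tests
$t_1(A)$ and $t_2(B)$, the assignments $R \leftarrow \varphi_1 CD$,
$R \leftarrow \varphi_2 CD$, $C \leftarrow \Psi_1 RB$,
$C \leftarrow \Psi_2 RB$, $C \leftarrow \theta RB$,
$C \leftarrow \Gamma RB$, and four STOP leaves; in $G_{P'}$ the root is
$t_2(B)$. The paths $\ell_1, \ldots, \ell_4$ of $\Pi$ and
$k_1, \ldots, k_4$ of $\Pi'$ are marked.]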

The  marked  paths  are  consistent. 

The following theorem gives a simple characterization of executable
and nonexecutable paths in terms of the test statements occurring in
the paths.
THEOREM 1

A path $\ell$ of a program $\Pi$ is executable iff there are no two
test statements in $\ell$ that reference variables that have the same
values and give different truth values.
PROOF:

1)  If there are two test statements that reference variables
that have the same values and give different truth values, this path
cannot be executed under any interpretation, and therefore is
nonexecutable. Thus if the path is executable there are no tests with
the above property.

2)  Assume there are no two tests with the above property.
Then path $\ell$ will be executed under the following interpretation $In$:

The domain set of $In$ is the set of strings of variables and
operator names.

If $A$ is a variable, $In(A) = A$.

If $\varphi$ is a function name and $\alpha_1, \ldots, \alpha_n$ are
strings, then $In(\varphi\alpha_1 \ldots \alpha_n)$ is
$\varphi\alpha_1 \ldots \alpha_n$, the concatenation of $\varphi$ with
$\alpha_1, \ldots, \alpha_n$.

If $t$ is a test name in path $\ell$,

1)  $In(t)(\alpha_1, \ldots, \alpha_n) = T$ for all
    $(\alpha_1, \ldots, \alpha_n)$ such that
    $\alpha_i = v_j^{\ell}(X_i)$, $1 \le i \le n$, and
    $t(X_1, \ldots, X_n) = T$ for some appearance $S_j^{\ell}$ of $t$ in
    path $\ell$.

2)  $In(t)(\alpha_1, \ldots, \alpha_n) = F$ for all
    $(\alpha_1, \ldots, \alpha_n)$ such that
    $\alpha_i = v_j^{\ell}(X_i)$, $1 \le i \le n$, and
    $t(X_1, \ldots, X_n) = F$ for some appearance $S_j^{\ell}$ of $t$ in
    path $\ell$.

3)  $In(t)(\alpha_1, \ldots, \alpha_n)$ is arbitrary for all other
    $n$-tuples $(\alpha_1, \ldots, \alpha_n)$.

Because no two tests in $\ell$ reference variables that have the same
values and give different truth values, the definition of the tests is
unique, and path $\ell$ is executed under the interpretation $In$.
Therefore $\ell$ is executable.
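Theorem 1 gives a purely syntactic test for executability, which can be sketched as follows. The encoding is an assumption of this sketch: each test on a path is recorded together with the truth value the path requires, as ("test", name, args, outcome), and values are the Polish-notation strings of the path-value definition. The sample paths are hypothetical; in particular the statement `L3 <- F L3` is an assumption chosen so that both tests see the same value.

```python
def is_executable(path, inputs):
    """Theorem 1 check: the path is executable unless the same test name
    is applied to identical argument values with different outcomes."""
    value = {a: a for a in inputs}
    required = {}   # (test name, argument expressions) -> required outcome
    for stmt in path:
        if stmt[0] == "assign":
            _, target, op, args = stmt
            value[target] = " ".join([op] + [value[b] for b in args])
        elif stmt[0] == "test":
            _, name, args, outcome = stmt
            key = (name, tuple(value[c] for c in args))
            if required.setdefault(key, outcome) != outcome:
                return False
    return True


# t is false on the value "F L3", then required to be true on the same value.
conflicting = [
    ("assign", "L2", "F", ("L3",)),
    ("test", "t", ("L2",), False),
    ("assign", "L3", "F", ("L3",)),
    ("test", "t", ("L3",), True),
    ("assign", "L2", "B", ("L1",)),
    ("stop",),
]
# Same statements, but both tests come out false: no conflict.
agreeing = [
    ("assign", "L2", "F", ("L3",)),
    ("test", "t", ("L2",), False),
    ("assign", "L3", "F", ("L3",)),
    ("test", "t", ("L3",), False),
    ("assign", "L2", "C", ("L1",)),
    ("stop",),
]
print(is_executable(conflicting, ("L1", "L3")))  # False
print(is_executable(agreeing, ("L1", "L3")))     # True
```

The dictionary `required` plays the role of the interpretation built in the proof: it records the truth value each test must take on each tuple of strings, and a clash means no such interpretation exists.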
THEOREM 2

$\Pi \equiv \Pi'$ iff for all consistent pairs of paths $\ell, k$,
$v^{\ell}(\Pi) = v^{k}(\Pi')$.

Theorem 2 states that two programs are equivalent iff the sets
of expressions computed along consistent pairs of paths are identical.
Thus deciding if two programs are equivalent can be done by checking
that terminal expressions are identical for consistent paths.

Two programs $\Pi$ and $\Pi'$ always have consistent pairs of paths
$\ell, k$. For each path in one program there is always a consistent path
in the other.
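Theorem 2 suggests a brute-force decision procedure: enumerate the executable paths of both programs, and compare terminal expressions on every consistent pair. The following sketch does this under stated assumptions: programs are lists of (prefix, statement) pairs, both programs share the same inputs and outputs, and two paths are treated as consistent when every test applied to identical argument values takes the same branch in both. All names (`paths`, `analyse`, `equivalent`, `P1`, `P2`, `P3`) are illustrative.

```python
from itertools import product


def paths(program):
    """All root-to-STOP paths; each test is recorded with its outcome."""
    index_of = {p: i for i, (p, _) in enumerate(program) if p is not None}

    def walk(pc):
        stmt = program[pc][1]
        if stmt[0] == "stop":
            yield [stmt]
        elif stmt[0] == "assign":
            for rest in walk(pc + 1):
                yield [stmt] + rest
        else:
            _, name, args, k1, k2 = stmt
            for outcome, k in ((True, k1), (False, k2)):
                for rest in walk(index_of[k]):
                    yield [("test", name, args, outcome)] + rest

    return list(walk(0))


def analyse(path, inputs, outputs):
    """Return (test record, output expressions), or None if nonexecutable."""
    value = {a: a for a in inputs}
    record = {}
    for stmt in path:
        if stmt[0] == "assign":
            _, target, op, args = stmt
            value[target] = " ".join([op] + [value[b] for b in args])
        elif stmt[0] == "test":
            _, name, args, outcome = stmt
            key = (name, tuple(value[c] for c in args))
            if record.setdefault(key, outcome) != outcome:
                return None
    return record, tuple(value[a] for a in outputs)


def equivalent(prog1, prog2, inputs, outputs):
    """Compare terminal expressions on all consistent pairs of paths."""
    info1 = [analyse(p, inputs, outputs) for p in paths(prog1)]
    info2 = [analyse(p, inputs, outputs) for p in paths(prog2)]
    for a, b in product(info1, info2):
        if a is None or b is None:
            continue                      # skip nonexecutable paths
        (rec1, val1), (rec2, val2) = a, b
        consistent = all(rec2[k] == v for k, v in rec1.items() if k in rec2)
        if consistent and val1 != val2:
            return False
    return True


P1 = [
    (None, ("test", "t", ("A",), 2, 3)),
    (2, ("assign", "C", "phi", ("A", "B"))),
    (None, ("stop",)),
    (3, ("assign", "C", "phi", ("A", "B"))),
    (None, ("stop",)),
]
P2 = [(None, ("assign", "C", "phi", ("A", "B"))), (None, ("stop",))]
P3 = [(None, ("assign", "C", "psi", ("A", "B"))), (None, ("stop",))]
print(equivalent(P1, P2, ("A", "B"), ("C",)))  # True
print(equivalent(P1, P3, ("A", "B"), ("C",)))  # False
```

The number of paths can grow exponentially with the number of tests, so this is meant only to illustrate the statement of the theorem, not as an efficient algorithm.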
The proof is based on Luckham, Park and Paterson (8).
PROOF:

1)  Let $\Pi \equiv \Pi'$, and let $\ell$ and $k$ be consistent. Since
$\Pi \equiv \Pi'$, by definition for all interpretations $In$,
$val_{In}(\Pi) = val_{In}(\Pi')$. Take $In$ as the following
interpretation:

The domain set of $In$ and the interpretation of inputs and functions
are the same as in Theorem 1. The test names are interpreted in the
following way:

If $t$ is a test name in path $j = \ell, k$,

1)  $In(t)(\alpha_1, \ldots, \alpha_n) = T$ for all
    $(\alpha_1, \ldots, \alpha_n)$ such that $\alpha_i = v_m^{j}(X_i)$,
    $1 \le i \le n$, and $t(X_1, \ldots, X_n) = T$ for some appearance
    $S_m^{j}$ of $t$ in paths $j = \ell, k$.

2)  $In(t)(\alpha_1, \ldots, \alpha_n) = F$ for all
    $(\alpha_1, \ldots, \alpha_n)$ such that $\alpha_i = v_m^{j}(X_i)$,
    $1 \le i \le n$, and $t(X_1, \ldots, X_n) = F$ for some appearance
    $S_m^{j}$ of $t$ in paths $j = \ell, k$.

3)  $In(t)(\alpha_1, \ldots, \alpha_n)$ is arbitrary for all other
    $n$-tuples $(\alpha_1, \ldots, \alpha_n)$.

$\ell, k$ are consistent, therefore the definition of the tests is unique.
$val_{In}(\Pi)$ was defined to be the vector of values of the output
variables after executing the program under interpretation $In$. Under
the interpretation $In$ above, path $\ell$ in $\Pi$ is executed and
$val_{In}(\Pi) = v^{\ell}(\Pi)$. Also under this interpretation path $k$
in $\Pi'$ is executed and $val_{In}(\Pi') = v^{k}(\Pi')$.

Therefore $v^{\ell}(\Pi) = v^{k}(\Pi')$.

2)  Let $v^{\ell}(\Pi) = v^{k}(\Pi')$ for all consistent pairs of paths
$\ell, k$. Let $In$ be any interpretation. We have to show that
$val_{In}(\Pi) = val_{In}(\Pi')$.
Let $\ell, k$ be the paths executed under the interpretation $In$ in
$\Pi$ and $\Pi'$ respectively. Then $\ell, k$ are consistent, so
$v^{\ell}(\Pi) = v^{k}(\Pi')$. But if $v^{\ell}(\Pi) = v^{k}(\Pi')$ then
$val_{In}(\Pi) = val_{In}(\Pi')$, because if the expressions for the
output variables are equal, then substituting the functions and inputs
gives equal values. (We used here the fact that the Polish notation
representation of expressions is nonambiguous.)

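EXAMPLE 6

a)  $\Pi$ and $\Pi'$ of Example 5 are equivalent. The consistent pairs
of paths are $\ell_1$ and $k_1$, $\ell_2$ and $k_2$, $\ell_3$ and $k_3$,
$\ell_4$ and $k_4$.

   $v^{\ell_1}(\Pi) = \Psi_1\varphi_1 CDB$      $v^{k_1}(\Pi') = \Psi_1\varphi_1 CDB$
   $v^{\ell_2}(\Pi) = \Psi_2\varphi_1 CDB$      $v^{k_2}(\Pi') = \Psi_2\varphi_1 CDB$
   $v^{\ell_3}(\Pi) = \theta\varphi_2 CDB$      $v^{k_3}(\Pi') = \theta\varphi_2 CDB$
   $v^{\ell_4}(\Pi) = \Gamma\varphi_2 CDB$      $v^{k_4}(\Pi') = \Gamma\varphi_2 CDB$

Therefore for all consistent pairs of paths the expressions
computed are identical. Thus $\Pi \equiv \Pi'$.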

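b)  $\Pi_1 = (G_{P_1}, \{A, B\}, \{C\})$      $\Pi_1' = (G_{P_1'}, \{A, B\}, \{C\})$

[Figure: the graphs $G_{P_1}$ and $G_{P_1'}$, built from a test $t_1$,
the assignments $C \leftarrow \Psi AB$ and $C \leftarrow \varphi AB$,
and STOP leaves.]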
In  this  example  the  consistent  pairs  of  paths  are  (<  ,k  ),

fig,   k^),    (ly        k2),    (tk,        1^). 


Vni- 
4.   TRANSFORMATIONS ON PROGRAMS

Let $\Pi$ be a program, and let $\ell$ be a path of the directed graph
with the corresponding sequence of statements
$S_1^{\ell}, \ldots, S_n^{\ell}$.

Suppose $S_i^{\ell}$ defines the variable $A$.

1)  If $A \notin U$, and $S_j^{\ell}$ is the last statement to reference
    this instance of $A$ (i.e. for no $k > j$ does $S_k^{\ell}$ reference
    $A$, unless $A$ is defined by some $S_t^{\ell}$, $j < t < k$), then
    the scope of $S_i^{\ell}$ in path $\ell$ is the sequence of
    statements $S_{i+1}^{\ell}, \ldots, S_j^{\ell}$.

2)  If $A \in U$, and for no $j > i$ does $S_j^{\ell}$ define $A$, then
    the scope of $S_i^{\ell}$ in path $\ell$ is
    $S_{i+1}^{\ell}, \ldots, S_n^{\ell}$ and the set $U$.

3)  If no $S_j^{\ell}$, $j > i$, references this instance of $A$, and $A$
    is not an output variable, then the scope of $S_i^{\ell}$ is null,
    and $S_i^{\ell}$ is said to be useless in path $\ell$.

Any program can be represented by a set of labeled directed
acyclic graphs (dags). For each executable path $\ell$ of the program we
construct a dag $D_{\ell}(\Pi)$ as follows:

i)   For all $A$, $A \in I$, we create a leaf labeled by $A$.

ii)  If $S_1^{\ell}, \ldots, S_n^{\ell}$ is the corresponding sequence of
     statements, for each $j = 1, \ldots, n$ we check if $S_j^{\ell}$ is
     an assignment statement. If it is, we create a node associated with
     it. If $S_j^{\ell}$ is of the form

        $A \leftarrow \theta B_1 \ldots B_r$

     and $n_1, \ldots, n_r$ are the nodes associated with the most recent
     definitions of $B_1, \ldots, B_r$ respectively (or if some
     $B_i \in I$, $n_i$ is the leaf with that label), then the node
     associated with $S_j^{\ell}$ has a label $\theta$ and direct
     descendants $n_1, \ldots, n_r$.

iii) We distinguish by circling those nodes associated with the last
     definitions of the output variables or the nodes associated with
     definitions of variables referenced by test statements.
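The three construction steps can be sketched in a few lines. This is an illustration only: nodes are stored in a list as (label, child indices), the set `circled` holds the indices of distinguished nodes, and tests are encoded as ("test", name, args). The function name `build_dag` and the encoding are assumptions, not the thesis's notation.

```python
def build_dag(path, inputs, outputs):
    """Build the dag of one path.

    Returns (nodes, circled): nodes[i] is (label, tuple of child indices),
    circled is the set of indices of distinguished nodes.
    """
    nodes = []
    latest = {}                       # variable -> node holding its current value
    for a in inputs:                  # step i: one leaf per input variable
        latest[a] = len(nodes)
        nodes.append((a, ()))
    circled = set()
    for stmt in path:
        if stmt[0] == "assign":       # step ii: one node per assignment
            _, target, op, args = stmt
            children = tuple(latest[b] for b in args)   # most recent definitions
            latest[target] = len(nodes)
            nodes.append((op, children))
        elif stmt[0] == "test":       # step iii: variables referenced by tests
            circled.update(latest[c] for c in stmt[2])
    circled.update(latest[u] for u in outputs)          # last output definitions
    return nodes, circled


left_path = [
    ("assign", "L1", "phi", ("X", "Y")),
    ("test", "t", ("L1",)),
    ("assign", "Z", "psi", ("L1",)),
    ("stop",),
]
nodes, circled = build_dag(left_path, ("X", "Y"), ("Z",))
print(nodes)    # [('X', ()), ('Y', ()), ('phi', (0, 1)), ('psi', (2,))]
print(circled)  # {2, 3}
```

The children are looked up before `latest[target]` is updated, so a statement such as `A <- theta A B` correctly refers to the previous definition of `A`.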
EXAMPLE 7

$\Pi = (G, \{B_1, B_2, D\}, \{M\})$

[Figure: the graph $G$, whose first statement is
$A \leftarrow \varphi B_1 B_2$, and $D(\Pi)$, the set of dags of $\Pi$.]

$\ell$ and $k$ are the left and right paths, respectively.
A transformation on a program is a mapping to the set of programs
which preserves program equivalence. We will define a set of
transformations that operate on loop-free programs.

T1 Removal of Useless Assignment Statements

If for all paths $\ell$ such that $S_i$ is in the corresponding
sequence of statements of $\ell$, $S_i$ is useless in $\ell$, then $S_i$
can be removed from the program. Also, if $A \in I$, $A \notin U$, and
$A$ is not referenced in $\Pi$, $A$ can be removed from $I$.

If $S_i$ is removed, all references to $S_i$ (addresses of test
statements) are changed to reference $S_{i+1}$.

T1 operates on all the dags $D_{\ell}$ corresponding to paths $\ell$
which include the useless statement $S_i$. T1 deletes a node which is not
distinguished and has no ancestors.
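The uselessness condition that T1 relies on can be checked on one path at a time. The sketch below follows the definition of a useless statement (no later reference to this instance of the variable, and the variable is not an output); T1 itself additionally requires the condition on every path through the statement, which this sketch does not enumerate. The encoding and the name `is_useless` are assumptions.

```python
def is_useless(path, i, outputs):
    """Return True if the assignment path[i] is useless in this path."""
    _, target, _op, _args = path[i]
    if target in outputs:
        return False
    for stmt in path[i + 1:]:
        if stmt[0] == "assign":
            refs = stmt[3]
        elif stmt[0] == "test":
            refs = stmt[2]
        else:
            refs = ()
        if target in refs:
            return False              # this instance of the variable is referenced
        if stmt[0] == "assign" and stmt[1] == target:
            return True               # redefined before any reference
    return True


path = [
    ("assign", "A", "f", ("X",)),     # overwritten below before being used
    ("assign", "A", "g", ("X",)),
    ("assign", "Z", "h", ("A",)),
    ("stop",),
]
print([is_useless(path, i, {"Z"}) for i in range(3)])  # [True, False, False]
```

References are checked before redefinitions within the same statement, so `A <- f A` counts as a reference to the earlier instance of `A`.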
T2 Removal of Redundant Assignment Statements

   $S_i$:  $A \leftarrow \varphi D_1 \ldots D_r$               $S_i$:  $C \leftarrow \varphi D_1 \ldots D_r$
   $S_j$:  $B \leftarrow \varphi D_1 \ldots D_r$    becomes    $S_j$:  deleted

If $S_i^{\ell}$ is $A \leftarrow \varphi D_1 \ldots D_r$,
$S_j^{\ell}$ is $B \leftarrow \varphi D_1 \ldots D_r$,
$S_i^{\ell}$ and $S_j^{\ell}$ are in the sequence of statements that
corresponds to path $\ell$, there is no path that includes $S_j$ and not
$S_i$, and for each path that includes both $S_i$ and $S_j$,
$D_1, \ldots, D_r$ are not defined by any $S_k$, $i < k < j$,
then $S_i^{\ell}$ is replaced by

   $C \leftarrow \varphi D_1 \ldots D_r$,

$S_j^{\ell}$ is deleted, and all references to $A$ and $B$ in the scopes
of $S_i^{\ell}$ and $S_j^{\ell}$ in all the paths that include $S_i$ are
changed to references to $C$. All references to $S_j^{\ell}$ are replaced
by references to $S_{j+1}^{\ell}$.

T2 operates on all the dags $D_{\ell}(\Pi)$ corresponding to paths $\ell$
which include both $S_i$ and $S_j$. All the other dags remain unchanged.
T2 corresponds to the merging of two nodes with identical direct
descendants.
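EXAMPLE 8

$\Pi = (P, \{K, X, Y\}, \{B, D\})$      $\Pi' = (P', \{K, X, Y\}, \{C, D\})$

[Figure: the graphs of $\Pi$ and $\Pi'$; the statement
$D \leftarrow \Gamma DB$ of $\Pi$ corresponds to
$D \leftarrow \Gamma DC$ in $\Pi'$. The paths $\ell_1, \ell_2, \ell_3$
and $k_1, k_2, k_3$ are marked.]

$(\ell_1, k_1)$, $(\ell_2, k_2)$, $(\ell_3, k_3)$ are the consistent
pairs of paths.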

l\  kl 

v    (n)  =  {exY,  ^KcpxY]  =  v    (it) 

£2  k2 

v     (II )    =   {cpXY,    "KpXY^KCpXY}    =  v     (n ' ) 

s  s 

v  ^(11)    =    {CDXY,    T^KCpXYCpXY]    -  v  ^(lV) 


T2 transforms $\Pi$ to $\Pi'$.

[Figure: the sets of dags for $\Pi$ and $\Pi'$, i.e. $D_{\ell_1}(\Pi)$,
$D_{\ell_2}(\Pi)$, $D_{\ell_3}(\Pi)$ and $D_{k_1}(\Pi')$,
$D_{k_2}(\Pi')$, $D_{k_3}(\Pi')$.]
T3 Renaming

   $S_i$:  $A \leftarrow \varphi B_1 \ldots B_r$

We replace $S_i$ by $S_i'$: $C \leftarrow \varphi B_1 \ldots B_r$, and
all references to $A$ in the scopes of $S_i$ in all the paths that
include $S_i$ are replaced by references to $C$.

Variable names do not appear in a dag, therefore the dags are not
affected.


T4 Flipping of Assignment Statements

   $S_i$      $(k)$     $A \leftarrow \varphi B_1 \ldots B_r$
   $S_{i+1}$  $(k+1)$   $C \leftarrow \Psi D_1 \ldots D_q$

   $A \notin \{C, D_1, \ldots, D_q\}$
   $C \notin \{B_1, \ldots, B_r\}$

and there is no path that includes $S_{i+1}$ and not $S_i$. Then $S_i$
and $S_{i+1}$ may be interchanged. If numerals precede the statements,
they are interchanged accordingly. T4 does not affect the dags.
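The two membership conditions of T4 are easy to state as a predicate. The sketch below checks only those two conditions; the requirement that no path includes the second statement without the first is a property of the whole program and is assumed to hold. The name `can_flip` and the tuple encoding are hypothetical.

```python
def can_flip(first, second):
    """Check the variable conditions for flipping two adjacent assignments.

    Each statement is ("assign", target, operator, args).
    """
    _, a, _, bs = first
    _, c, _, ds = second
    # The first target must differ from the second target and its arguments,
    # and the second target must not be an argument of the first statement.
    return a not in (c, *ds) and c not in bs


s1 = ("assign", "A", "phi", ("B1", "B2"))
print(can_flip(s1, ("assign", "C", "psi", ("D1",))))   # True
print(can_flip(s1, ("assign", "C", "psi", ("A",))))    # False: second reads A
print(can_flip(s1, ("assign", "B1", "psi", ("D1",))))  # False: second writes B1
```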
T5 Merging of Identical Assignment Statements

(a)

[Figure: the test $t(C_1, \ldots, C_q)$ followed on both branches by
$D \leftarrow \varphi B_1 \ldots B_r$, before and after the
transformation.]

If the statement $t(C_1, \ldots, C_q)\ k_1, k_2$ is in paths $\ell$ and
$\ell'$, i.e.

   $S_i^{\ell} = S_i^{\ell'} = t(C_1, \ldots, C_q)\ k_1, k_2$

and

   $S_{i+1}^{\ell} = S_{i+1}^{\ell'} = D \leftarrow \varphi B_1 \ldots B_r$

and there is no path that includes $S_{i+1}^{\ell}$ ($S_{i+1}^{\ell'}$)
and does not include $t(C_1, \ldots, C_q)$, then the statement
$D \leftarrow \varphi B_1 \ldots B_r$ is moved before
$t(C_1, \ldots, C_q)$ and $S_{i+1}^{\ell}$, $S_{i+1}^{\ell'}$ are
deleted. The prefix of the statement
$D \leftarrow \varphi B_1 \ldots B_r$ is that of $t(C_1, \ldots, C_q)$.
The prefix of $t(C_1, \ldots, C_q)$ is deleted. $k_1, k_2$ in the test
statement are replaced by references to $S_{i+2}^{\ell}$,
$S_{i+2}^{\ell'}$.

(b)

[Figure: two copies of $D \leftarrow \varphi B_1 \ldots B_r$ at the ends
of two merging paths, before and after the transformation.]

If $S_i^{\ell} = S_i^{\ell'} = D \leftarrow \varphi B_1 \ldots B_r$
and $S_i^{\ell}$, $S_i^{\ell'}$ are the last assignment statements before
$\ell$ and $\ell'$ are merged, then $S_i^{\ell}$ and $S_i^{\ell'}$ can be
merged. All references to $S_i^{\ell}$, $S_i^{\ell'}$ are replaced by
references to the merged statement. T5 does not affect the dags.
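EXAMPLE 9

$\Pi = (P, \{A, B\}, \{L\})$

P:      $t_1(A)$   2,5
   2    $N \leftarrow \varphi AB$
        $L \leftarrow \Psi NN$
        $t(B)$   7,7
   5    $N \leftarrow \theta AB$
        $L \leftarrow \Psi NN$
   7    $L \leftarrow \Gamma LL$
        STOP

$\Pi' = (P', \{A, B\}, \{L\})$

P':     $t_1(A)$   2,5
   2    $N \leftarrow \varphi AB$
        $t(B)$   7,7
   5    $N \leftarrow \theta AB$
   7    $L \leftarrow \Psi NN$
        $L \leftarrow \Gamma LL$
        STOP

T5 transforms $\Pi$ to $\Pi'$.

[Figure: the graphs of $\Pi$ and $\Pi'$. In $\Pi$ the statement
$L \leftarrow \Psi NN$ appears on both branches of $t_1(A)$; in $\Pi'$
it appears once, after the branches merge and before
$L \leftarrow \Gamma LL$.]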

T6 Removal of Nonexecutable Paths

(a)  If  $S_i^{\ell} = t(A_1, \ldots, A_r)\ k_1, k_2$
         $S_j^{\ell} = t(C_1, \ldots, C_r)\ k_3, k_4$,   $j > i$

(*)  and there is no path $\ell$ that includes $S_j$ and not $S_i$
(**) and $\forall k$, $1 \le k \le r$:  $v_i^{\ell}(A_k) = v_j^{\ell}(C_k)$

then

(i)  if $S_j^{\ell}$ is on the right branch of $S_i^{\ell}$, $S_j^{\ell}$
     is changed to $t(C_1, \ldots, C_r)\ k_4, k_4$,
(ii) if $S_j^{\ell}$ is on the left branch of $S_i^{\ell}$, $S_j^{\ell}$
     is changed to $t(C_1, \ldots, C_r)\ k_3, k_3$.

[Figure: the tests $t(A_1, \ldots, A_r)$ and $t(C_1, \ldots, C_r)$ in the
two cases of T6(a).]

and

(b)  If  $S_i^{\ell} = t(A_1, \ldots, A_r)\ k_1, k_2$ and

(i)  $S_{k_1}^{\ell} = t(C_1, \ldots, C_r)\ k_3, k_4$
     and (*) and (**) hold for $S_i^{\ell}$ and $S_{k_1}^{\ell}$, then we
     change $S_i^{\ell}$ to $t(A_1, \ldots, A_r)\ k_3, k_2$ and
     $S_{k_1}^{\ell}$ is deleted.

(ii) $S_{k_2}^{\ell} = t(C_1, \ldots, C_r)\ k_3, k_4$
     and (*) and (**) hold for $S_i^{\ell}$ and $S_{k_2}^{\ell}$, then we
     change $S_i^{\ell}$ to $t(A_1, \ldots, A_r)\ k_1, k_4$ and
     $S_{k_2}^{\ell}$ is deleted.

[Figure: the tests $t(A_1, \ldots, A_r)$ and $t(C_1, \ldots, C_r)$ in the
two cases of T6(b).]

Since T6 gets rid of tests, it uncircles nodes in the dags
that are associated only with variables referenced by the tests removed.

T7 Removal of Unconditional Useless Test Statements

If $S_i$ is of the form $t(C_1, \ldots, C_n)\ k_1, k_1$ and $k_1 = i+1$
(i.e. both transfer addresses refer to the statement that immediately
follows $S_i$), then $S_i$ is removed. All references to $S_i$ are
changed to references to $S_{i+1}$.

Since T7 removes tests, it uncircles nodes in the dags that
are associated only with variables referenced by the tests removed.

T8 Flipping of Tests

If $S_i^{\ell} = t_1(A_1, \ldots, A_r)\ k_1, k_2$ and $S_i$ is on both
paths $\ell$ and $\ell'$, $S_i^{\ell} = S_i^{\ell'}$,

and $S_{k_1}^{\ell} = t_2(B_1, \ldots, B_q)\ k_3, k_4$
and $S_{k_2}^{\ell'} = t_2(B_1, \ldots, B_q)\ k_5, k_6$

then $t_1$ and $t_2$ can be flipped so that
$S_i^{\ell} = S_i^{\ell'} = t_2(B_1, \ldots, B_q)\ k_1, k_2$.

[Figure: the tests $t_1(A_1, \ldots, A_r)$ and $t_2(B_1, \ldots, B_q)$
before and after flipping.]

The sequences of statements for the same combinations of tests
are not changed, therefore the transformation does not affect the set
of dags.
T9  Removal  of  Unreferenced  Blocks 

If  the  first  statement  of  a  block  is  not  referenced  (i.e.  either 
the  numeral  is  not  a  transfer  address  of  any  of  the  statements  of  P, 
or  the  first  statement  of  the  block  is  not  prefixed  by  a  numeral  and  the 
block  is  not  the  first  in  the  program)  then  the  whole  block  is  removed. 

If  the  first  statement  of  a  block  is  not  referenced,  there  is 
no  executable  sequence  of  statements  that  contains  the  block,  thus  T9 
does  not  affect  the  dags. 

An example of the use of T9 is removing a useless STOP statement.

If $S_i$ is STOP and $S_{i+1}$ is STOP, and there is no path that
includes $S_{i+1}$ and does not include $S_i$, then T9 can be used to
remove $S_{i+1}$.
EXAMPLE 10

$\Pi$ is the program of Example 2. We would like to remove the nonexecutable path.

$\Pi = (P, \{L_1, L_3\}, \{L_2\})$

[Figure: the graphs of $\Pi$ and $\Pi'$.]

The transformations T6, T9 and T7 are applied in turn:

      Π                after T6          after T9          after T7 (Π')

      L2 ← FL3         L2 ← FL3          L2 ← FL3          L2 ← FL3
      t(L2) 3,5        t(L2) 3,5         t(L2) 3,5         t(L2) 3,5
   3  L2 ← DL1      3  L2 ← DL1       3  L2 ← DL1       3  L2 ← DL1
      STOP             STOP              STOP              STOP
   5  L3 ← FL3      5  L3 ← FL3       5  L3 ← FL3       5  L3 ← FL3
      t(L3) 7,9        t(L3) 9,9         t(L3) 9,9      9  L2 ← CL1
   7  L2 ← BL1      7  L2 ← BL1       9  L2 ← CL1          STOP
      STOP             STOP              STOP
   9  L2 ← CL1      9  L2 ← CL1
      STOP             STOP

T10  Flipping of Blocks

Any two blocks (except the first) can be flipped. The graph of the program is not changed, and the dags are not affected.
T11  Merging of Identical Subgraphs

a) If $S_i = t(A_1,\ldots,A_r)$ and $S_j = t(C_1,\ldots,C_r)$ are test statements,

and $\forall k$, $1 \le k \le r$, $v_i(A_k) = v_j(C_k)$,

and $S_i$, $S_j$ are the roots of identical subgraphs $D_i$, $D_j$ respectively, then $D_i$ and $D_j$ are merged, $S_j$ is deleted, and all references to statements in $D_j$ are changed to references to statements in $D_i$.

[Figure: two identical subgraphs rooted at $t(A_1,\ldots,A_r)$ and $t(C_1,\ldots,C_r)$ are merged by T11 into a single subgraph rooted at $t(A_1,\ldots,A_r)$.]

The executable paths are not affected by this transformation, therefore the dags are not affected.


b) Also, if the subgraphs $D_i$ and $D_j$ are identical, and also the sequences of assignment statements $S_1^{\ell},\ldots,S_n^{\ell}$ and $S_1^{\ell'},\ldots,S_q^{\ell'}$ preceding $D_i$ and $D_j$ respectively are identical, then a test statement is added to the program and the sequences of assignment statements together with the subgraphs are merged.

A trivial case of T11(b) which will be used later is

[Figure: the trivial case of T11(b).]

Here $D_i$ and $D_j$ are empty, and the identical sequences of statements are merged.

EXAMPLE 11

Let $\Pi$ and $\Pi'$ be the programs of Example 6(b). We would like to show that $\Pi$ can be transformed to $\Pi'$ by applying T5, T7, T8, T11.

[Figure: the successive program graphs obtained from $\Pi$ by the transformations T5, T7, T8 and T11.]
All the transformations presented above are equivalence preserving. This is clear from the definitions.

We denote $\Pi \Rightarrow_i \Pi'$ if transformation $T_i$ transforms $\Pi$ to $\Pi'$, $i = 1,2,3,\ldots,11$.

We define $\stackrel{R}{\Rightarrow}_i$ to be the reverse of $\Rightarrow_i$.

We say $\Pi \stackrel{*}{\Rightarrow}_S \Pi'$, where $S \subseteq \{1,2,\ldots,11\}$, if there is a sequence of programs $\Pi_1,\ldots,\Pi_m$, $\Pi_1 = \Pi$, $\Pi_m = \Pi'$, and for all $i$, $\Pi_i \Rightarrow_j \Pi_{i+1}$ or $\Pi_i \stackrel{R}{\Rightarrow}_j \Pi_{i+1}$, $j \in S$.

DEFINITION

A set of transformations $J$ is defined to be complete iff

$\Pi \equiv \Pi' \implies \Pi \stackrel{*}{\Rightarrow}_J \Pi'$
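
The relation defined above is reachability when every transformation may be used forwards or in reverse. A minimal sketch over a finite, explicitly listed set of programs follows; the names `related` and `drop_adjacent_duplicate`, the toy programs (tuples of statement strings) and the toy transformation are assumptions made for illustration and are not the transformations T1 to T11.

```python
from collections import deque


def related(start, goal, transformations, allowed, universe):
    """Decide whether `goal` is reachable from `start` using the
    transformations whose indices are in `allowed`, forwards or in reverse.

    `transformations` maps an index j to a function that yields the programs
    obtained from a program by one application of transformation j.
    `universe` is the finite set of programs under consideration.
    """
    neighbours = {program: set() for program in universe}
    for program in universe:
        for j in allowed:
            for result in transformations[j](program):
                if result in neighbours:
                    # One edge covers both the forward and the reverse step.
                    neighbours[program].add(result)
                    neighbours[result].add(program)
    seen, queue = {start}, deque([start])
    while queue:
        program = queue.popleft()
        if program == goal:
            return True
        for result in neighbours[program] - seen:
            seen.add(result)
            queue.append(result)
    return False


def drop_adjacent_duplicate(program):
    """Toy transformation: delete one of two equal adjacent statements."""
    for i in range(len(program) - 1):
        if program[i] == program[i + 1]:
            yield program[:i] + program[i + 1:]


universe = {("a", "a", "b"), ("a", "b"), ("a", "b", "b")}
toy = {1: drop_adjacent_duplicate}
connected = related(("a", "a", "b"), ("a", "b", "b"), toy, {1}, universe)
```

Here `("a", "a", "b")` reaches `("a", "b", "b")` by one forward step to `("a", "b")` followed by one reverse step, which is the pattern the proofs below use repeatedly.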

THEOREM 3

If the graph corresponding to a program $\Pi$ has nonexecutable paths, then $\Pi$ can be transformed to an equivalent program $\Pi'$ in which each path is executable, by applying a sequence of transformations from the set $S = \{T6, T11\}$.

PROOF:

If path $\ell$ is nonexecutable, by Theorem 1 there must be at least two tests in path $\ell$ that check the same values and give different truth values.

Let $S_i^{\ell}$ be $t(A_1,\ldots,A_r)$ and $S_j^{\ell}$ be $t(C_1,\ldots,C_r)$, $j > i$, and $v_i^{\ell}(A_k) = v_j^{\ell}(C_k)$, $1 \le k \le r$, and $t(A_1,\ldots,A_r) =$ false, $t(C_1,\ldots,C_r) =$ true.
The following cases are possible:

(1) $\ell$ is the only path that includes $S_i^{\ell}$ and $S_j^{\ell}$. In this case conditions (*) and (**) of T6 hold, and T6 can be applied to get rid of the nonexecutable path.

[Figure: path $\ell$ with the tests $t(A_1,\ldots,A_r)$ and $t(C_1,\ldots,C_r)$.]

(2) $S_j$ is on both branches of $S_i$, and there is no path that includes $S_j$ and not $S_i$. In this case we operate T11 in reverse and then T6 on both branches.

[Figure: reverse of T11 followed by T6.]

(3) There is at least one path that includes $S_j$ and not $S_i$. In this case we again use T11 in reverse, and then apply T6 on the left subtree.

[Figure: reverse of T11.]

We may conclude from Theorem 3 that any program that has no loops and contains nonexecutable paths can be transformed to an equivalent program such that each path of its graph is executable. The transformations T6 and T11 are applied as many times as necessary.
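
The criterion used in the proof (two tests on one path that check the same values and give different truth values) can be sketched as a simple check. The encoding of a path as a list of `(test, values, outcome)` triples, with symbolic values written as strings, and the name `is_executable` are assumptions made for this illustration.

```python
def is_executable(path_tests):
    """Return False when two tests on the path check the same values
    but are required to give different truth values.

    `path_tests` lists (test_name, values, outcome) in path order, where
    `values` are the symbolic values of the tested variables.
    """
    required = {}
    for test_name, values, outcome in path_tests:
        key = (test_name, tuple(values))
        # Remember the first outcome and compare every later one with it.
        if required.setdefault(key, outcome) != outcome:
            return False
    return True


# Both tests check the value F(L3); the path needs false, then true.
contradictory = [("t", ("F(L3)",), False), ("t", ("F(L3)",), True)]
consistent = [("t", ("F(L3)",), False), ("t", ("F(L3)",), False)]
```

A path rejected by this check is one of the paths that T6 and T11 are used to eliminate.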

Lemma 1

If $\Pi \Rightarrow_3 \Pi'$ then $\Pi \stackrel{*}{\Rightarrow}_{\{1,2\}} \Pi'$.

PROOF:

Assume $S_i = A \leftarrow \varphi B_1 \ldots B_r$. Then $\Pi \Rightarrow_3 \Pi'$ means that in $\Pi'$ $S_i$ is replaced by $S_i' = C \leftarrow \varphi B_1 \ldots B_r$ and all references to $A$ in the scopes of $S_i$ in all the paths that include $S_i$ are replaced by references to $C$.

Therefore $\Pi \stackrel{R}{\Rightarrow}_2 \Pi_1 \Rightarrow_1 \Pi'$

where in $\Pi_1$ $S_i$ is replaced by $S_i, S_i'$:

$S_i = A \leftarrow \varphi B_1 \ldots B_r$
$S_i' = C \leftarrow \varphi B_1 \ldots B_r$

and all references to $A$ in the scopes of $S_i$ in all the paths that include $S_i$ are replaced by references to $C$. $\Pi_1 \Rightarrow_2 \Pi$, therefore $\Pi \stackrel{R}{\Rightarrow}_2 \Pi_1$. $S_i$ is useless in all the paths that include $S_i$, therefore $\Pi_1 \Rightarrow_1 \Pi'$.

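Lemma 2

If $\Pi \Rightarrow_4 \Pi'$ then $\Pi \stackrel{*}{\Rightarrow}_{\{1,2\}} \Pi'$.

PROOF:

$\Pi \stackrel{*}{\Rightarrow}_{\{1,2\}} \Pi_1 \Rightarrow_2 \Pi'$, where in $\Pi_1$ we insert $S_{i+1}$ between $S_{i-1}$ and $S_i$.

The two statements $S_{i+1}$ are redundant, therefore we can apply T2 and get $\Pi'$.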


Lemma 3

If $\Pi \Rightarrow_5 \Pi'$, then $\Pi \stackrel{*}{\Rightarrow}_{\{1,2\}} \Pi'$.

PROOF:

We insert $S_i$ before $t$, which is useless in all the paths that contain it.

[Figure: the graphs of $\Pi$ and $\Pi_1$.]

Then we apply T2 twice and we get $\Pi'$.

Lemma 4

[Figure: statement of the lemma.]

PROOF:

We operate T6 together with T7 four times.

[Figure: the intermediate graphs obtained with T6 and T7.]

We will define an enumeration of the paths of a graph corresponding to a program schema. Nonexecutable paths will not be enumerated. The definition of the enumeration will be stated recursively, as follows.

Definition of enumeration

1) Look at the root $v_i$; if it is a leaf, add the path terminated by $v_i$ to the enumeration.

2) Otherwise
enumerate the left subgraph,
enumerate the right subgraph;
nonexecutable paths are not enumerated.
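
The recursive definition can be sketched directly. The node encoding `(name, left, right)`, the function name `enumerate_paths` and the optional `executable` predicate are assumptions made for this illustration; by default every path is treated as executable.

```python
def enumerate_paths(node, prefix=(), executable=lambda path: True):
    """Enumerate root-to-leaf paths, left subgraph before right subgraph.

    A node is a tuple (name, left, right); a leaf has left = right = None.
    Paths rejected by `executable` are not enumerated.
    """
    name, left, right = node
    path = prefix + (name,)
    if left is None and right is None:
        # Rule 1: a leaf terminates a path.
        return [path] if executable(path) else []
    paths = []
    # Rule 2: left subgraph first, then right subgraph.
    if left is not None:
        paths += enumerate_paths(left, path, executable)
    if right is not None:
        paths += enumerate_paths(right, path, executable)
    return paths


leaf = lambda name: (name, None, None)
tree = ("v0", ("v1", leaf("v2"), leaf("v3")), leaf("v4"))
all_paths = enumerate_paths(tree)
```

Because the left subgraph is always enumerated first, every path through the left branch of a test precedes every path through its right branch, a fact the later proofs rely on.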


EXAMPLE  12 


The  enumeration  of  the  paths  (assuming  all  are  executable) 
will  be  as  follows: 
(VVv2) 

(v0,vrv5) 

(v0,vvv6,v10,v12,v9) 

Let $\Pi$ and $\Pi'$ be two programs in which each path is executable, with the enumerations $\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ respectively. The enumeration induces a 1-1 correspondence between the paths of $\Pi$ and $\Pi'$:

$\ell_1 \leftrightarrow k_1, \ldots, \ell_n \leftrightarrow k_n$

Lemma 5

If $\Pi$ and $\Pi'$ are two programs in which corresponding paths have the same tests appearing in the same order, then corresponding paths are consistent.

PROOF:

We will first show that the trees $T$ and $T'$ of $\Pi$ and $\Pi'$ respectively are similar, i.e. they have the same structure. By Knuth (6) similarity can be proved by showing that there is a one-to-one correspondence between the nodes of the two trees which preserves the structure, so that if nodes $u_1$ and $u_2$ in $T$ correspond respectively to nodes $u_1'$ and $u_2'$ in $T'$, then $u_1$ is in the left subtree of $u_2$ iff $u_1'$ is in the left subtree of $u_2'$, and the same holds for right subtrees.

Let $u_1,\ldots,u_n$ and $v_1,\ldots,v_{n'}$ be the enumerations of the nodes of $T$ and $T'$ respectively (with repetitions) induced by the enumerations of the paths. Since corresponding paths have the same tests, $n = n'$. We will show by induction that $u_{i+1}$ is the left descendant of $u_i$ iff $v_{i+1}$ is the left descendant of $v_i$, and $u_{i+1}$ is the right descendant of $u_i$ iff $v_{i+1}$ is the right descendant of $v_i$.

The case $i=1$ is trivial.

Assume we proved for $1,2,\ldots,i$. Then the subtrees which include the nodes $u_1,\ldots,u_i$ and $v_1,\ldots,v_i$ are similar.

Let $u_{i+1}$ be the first node for which the theorem has not been proven. There are two possible cases:

1) $u_{i+1}$ is the left descendant of $u_i$. Since all corresponding paths have the same length, $v_{i+1}$ has to be on the same path as $v_i$, and since the left descendant is the next node to be enumerated, $v_{i+1}$ must be the left descendant of $v_i$.

2) $u_{i+1}$ is the right descendant of $u_i$. Since corresponding paths have the same length, $v_i$ and $v_{i+1}$ are on the same path in $T'$, thus $v_{i+1}$ is a descendant of $v_i$. It cannot be the left descendant because we assumed that the subtrees which include $u_1,\ldots,u_i$ and $v_1,\ldots,v_i$ are similar. Therefore $v_{i+1}$ is the right descendant of $v_i$.

Since the trees have the same structure and corresponding paths have the same tests appearing in the same order, it is clear that corresponding paths are consistent.
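
Similarity of two binary trees, in the sense used in the proof, depends only on their shape. A minimal sketch, assuming the hypothetical node encoding `(label, left, right)` with `None` for a missing subtree:

```python
def similar(a, b):
    """Two binary trees are similar when they have the same structure:
    both are empty, or their left subtrees are similar and their right
    subtrees are similar. Node labels are ignored."""
    if a is None or b is None:
        return a is None and b is None
    return similar(a[1], b[1]) and similar(a[2], b[2])


t1 = ("t1", ("t2", None, None), ("t3", None, ("t4", None, None)))
t2 = ("u1", ("u2", None, None), ("u3", None, ("u4", None, None)))
t3 = ("u1", ("u2", None, None), ("u3", ("u4", None, None), None))
```

Here `t1` and `t2` are similar although their labels differ, while `t3` is not similar to `t1` because a right descendant has become a left descendant.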

THEOREM 4

Let $\Pi$ and $\Pi'$ be two programs in which each path is executable, with the enumerations $k_1,\ldots,k_n$ and $k_1',\ldots,k_m'$ respectively. Then using the transformations T1, T5, T7, T8 and T11, $\Pi$ and $\Pi'$ can be transformed to equivalent programs $\Psi$ and $\Psi'$ respectively, with the enumerations $\ell_1,\ldots,\ell_q$ and $\ell_1',\ldots,\ell_q'$, in which each path is executable, such that corresponding paths in $\Psi$ and $\Psi'$ are consistent and have the same tests.

PROOF:

We will assume that the graphs of $\Pi$ and $\Pi'$ are trees. This assumption does not cause any difficulty because T11 may be used in reverse as many times as necessary to transform the graphs to trees.

We will construct two sequences of programs $\Psi_0, \Psi_1, \ldots, \Psi_q$ and $\Psi_0', \Psi_1', \ldots, \Psi_q'$ such that $\Psi_0 = \Pi$, $\Psi_0' = \Pi'$, $\Psi_i \stackrel{*}{\Rightarrow}_{\{1,5,7,8\}} \Psi_{i+1}$, $\Psi_i' \stackrel{*}{\Rightarrow}_{\{1,5,7,8\}} \Psi_{i+1}'$, and for all $i$, $1 \le i \le q$, the first $i$ corresponding paths of $\Psi_i$ and $\Psi_i'$ are consistent. Then $\Psi_q$ and $\Psi_q'$ will be the programs $\Psi$ and $\Psi'$ of the theorem.

Throughout this proof identical tests checking different values are considered as different tests.

By Lemma 5, if corresponding paths have the same tests appearing in the same order, then corresponding paths are consistent. So in each stage of the process we apply transformations T1, T5, T7, T8 so that corresponding paths will have the same tests appearing in the same order. Useless unconditional tests will be eliminated by T7 before applying the above transformations.

$\Psi_1$ and $\Psi_1'$ will be constructed as follows:

First we eliminate all useless unconditional tests by applying T7. Then

1) If the tests in $k_1$ and $k_1'$ are $t_1,\ldots,t_n$ and they appear in the same order in both paths, then $\Psi_1 = \Pi$ and $\Psi_1' = \Pi'$.

2) If the tests in both $k_1$ and $k_1'$ appear in the same order, but there is a test $t_j$ in $k_1$ that does not appear in $k_1'$, we use T1 reversed as many times as necessary to insert statements such that $v^{k_1'}(A) = v^{k_1}(A)$, and then we use the reverse of T7, T11 to insert $t_j(A)$ in $k_1'$. We get an equivalent program.

[Figure: the test $t_j$ inserted into $\Pi'$ by the reverse of T1, T7 and T11.]

We operate the reverse of T1, T7 and T11 as many times as necessary on both $\Pi$ and $\Pi'$ so that both paths will have the same tests.

3) Assume the tests appear in a different order and let $t_i$ and $t_j$ be two adjacent tests appearing in a different order in $\Pi$ and $\Pi'$.

[Figure: the tests $t_i$ and $t_j$ in $\Pi$ and in $\Pi'$.]

We will use the reverse of T5 as many times as necessary to move the statements $S_1,\ldots,S_n$ after $t_i$ so that T8 can be operated.

(a) If $t_i$ appears on the right branch of $t_j$ in $\Pi'$ right after $t_j$, we will operate T8 on $\Pi'$

[Figure: application of T8.]

to have $t_i$ and $t_j$ in the same order as in $\Pi$.
(b) Otherwise we will operate the reverse of T7 to insert $t_i$ on the right branch of $t_j$ (if $v_n$ is a STOP statement, we will insert $t_i$ before $v_n$), and then use the reverse of T5 as many times as necessary to move all the assignment statements after $t_i$. Now T8 can be applied.

(c) If $t_i$ appears on the right branch of $t_j$ but there are test statements between $t_i$ and $t_j$, we will use T7 to insert $t_i$ on the right branch of $t_j$ and then use T8. Now we apply the same procedure as in case (a).

The above process will be applied to each pair of $t_i$ and $t_j$ that appear in a different order in the two paths.

4) The same as in 3) but $t_i$ and $t_j$ are not adjacent in $\Pi'$.

[Figure: the tests $t_1,\ldots,t_n$ separate $t_j$ and $t_i$ in $\Pi'$.]

In this case we will use T7 to insert $t_i$ on the right branch of $t_n$ and then apply T8. We might have to use T5 to move assignment statements on both branches of $t_n$ so that T8 can be applied. We will use the same procedure as in case 3).

We will repeat this process for all $t_n,\ldots,t_1$.

[Figure: the graphs of $\Pi$ and $\Pi'$ after the tests have been flipped.]

We conclude from 1)-4) that $\Pi \stackrel{*}{\Rightarrow}_{\{1,5,7,8\}} \Psi_1$ and $\Pi' \stackrel{*}{\Rightarrow}_{\{1,5,7,8\}} \Psi_1'$.


Assume that we have constructed $\Psi_1,\ldots,\Psi_i$ and $\Psi_1',\ldots,\Psi_i'$ such that for the first $i$ paths of $\Psi_i$ and $\Psi_i'$ corresponding paths are consistent. We would like to construct $\Psi_{i+1}$ and $\Psi_{i+1}'$.

Let the tests of $\ell_{i+1}$ and $\ell_{i+1}'$ be $t_1,\ldots,t_r$ and $t_1',\ldots,t_q'$ respectively. The cases 1), 2) are as before. For 3), 4) assume first that $t_j$ is on the left branch of $t_i$ in $\Psi_i$;

[Figure: $t_j$ on the left branch of $t_i$.]

then the proof is as in 3) above.

If $t_j$ is on the right branch of $t_i$

[Figure: $t_j$ on the right branch of $t_i$.]

we observe that this case is impossible because $\ell_i$ and $\ell_i'$ include the same tests appearing in the same order.
4) is proved in the same way.
Remark: We observe that in $\Psi$ and $\Psi'$ no path has useless unconditional tests.

THEOREM 5

Let $\Pi$ and $\Pi'$ be two programs with the enumerations $\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ respectively, in which
i) each path is executable,
ii) corresponding paths are consistent and have the same tests,
iii) all the statements of the programs appear on the graph (i.e. there are no unreferenced blocks),
iv) no path has useless unconditional tests.
Then

$\forall i$, $1 \le i \le n$, $D_i(\Pi) = D_i(\Pi')$ iff $\Pi \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi'$.

$D_i(\Pi)$ denotes the dag $D_{\ell_i}(\Pi)$, $1 \le i \le n$, and $D_i(\Pi')$ denotes the dag $D_{k_i}(\Pi')$, $1 \le i \le n$.
PROOF:

1) The "if" part is simple and was discussed when we listed the transformations.

2) We will show that if $\ell_1,\ldots,\ell_n$ are the paths of $\Pi$, $k_1,\ldots,k_n$ are the paths of $\Pi'$, and $\forall i$, $1 \le i \le n$, $D_i(\Pi) = D_i(\Pi')$, then

$\Pi \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi'$.

If the dags are identical then the input sets of $\Pi$ and $\Pi'$ are identical because the inputs are the leaves of the dags.

Let $S_1^{\ell_i},\ldots,S_{n_i}^{\ell_i}$ be the sequence of assignment statements of path $\ell_i$ in program $\Pi$. $T_1^{k_i},\ldots,T_{n_i}^{k_i}$ is the sequence of assignment statements of path $k_i$ in program $\Pi'$. The two sequences have the same length because each statement corresponds to a node of $D_i$, and we assumed that the dags are identical.

Renumber the assignment statements $S_1,\ldots,S_m$ of $\Pi$ by eliminating repetitions from the sequence $S_1^{\ell_1},\ldots,S_{n_1}^{\ell_1},\ldots,S_1^{\ell_n},\ldots,S_{n_n}^{\ell_n}$.

We shall define inductively a sequence of programs $\Pi_0, \Pi_1, \ldots, \Pi_m$ such that $\Pi_0 = \Pi'$, $\Pi_m = \Pi$ and $\Pi_j \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi_{j+1}$. Assume inductively that

1) $\Pi' \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi_j$,
2) corresponding paths in $\Pi'$ and $\Pi_j$ are consistent and have the same tests,
3) if $U_1^j, U_2^j, \ldots, U_m^j$ are the assignment statements of $\Pi_j$ numbered without repetitions using the same method as was used to number the statements $S_1,\ldots,S_m$ of $\Pi$, then

$U_1^j = S_1,\ U_2^j = S_2,\ \ldots,\ U_j^j = S_j$.
Clearly the assumptions 1-3 are valid for $j=0$.

Let $\Pi_j$ be given and assume $S_{j+1}$ is $A \leftarrow \varphi A_1 \ldots A_k$ and lies on path $\ell_i$. Because $D_i(\Pi_j) = D_i(\Pi)$ there is an assignment statement $U_v^j = B \leftarrow \varphi A_1 \ldots A_k$ that has the same node in $D_i(\Pi_j)$ that $S_{j+1}$ has in $D_i(\Pi)$. We see that $v > j$ because $U_v^j$ is not in the sequence $U_1^j,\ldots,U_j^j$. None of the $A_1,\ldots,A_k$ is defined by $U_{j+1}^j,\ldots,U_{v-1}^j$ because otherwise $D_i(\Pi)$ would not be equal to $D_i(\Pi_j)$.

Let $S_r$ be the last statement in path $\ell_i$ for which the corresponding statement in $\Pi_j$ appears in the same position. Since the enumeration is without repetitions, $S_r$ may be different from $S_j$. We will distinguish between two cases:

(a) in $\Pi$, $S_{j+1}$ is the next statement after $S_r$.

(b) $S_{j+1}$ is the next assignment statement but there are one or more test statements between $S_r$ and $S_{j+1}$.

$S_r$ might be null in the initial step of $\Pi_0 = \Pi'$.
(a) $S_{j+1}$ follows $S_r$ in $\Pi$.

The following cases are possible for $\Pi_j$:

1) There is no test between $U_r^j = S_r$ and $U_v^j$ in $\Pi_j$, and there is no path that includes $U_v^j$ but not $U_r^j$.

We can apply T4 as many times as necessary to move $U_v^j$ right after $S_r$. T4 can be applied because we proved that none of $A_1,\ldots,A_k$ is defined by $U_{j+1}^j,\ldots,U_{v-1}^j$. By T3 we change $B$ to $A$. We get a program $\Pi_{j+1}$, $\Pi_j \stackrel{*}{\Rightarrow}_{\{3,4\}} \Pi_{j+1}$.

2) In $\Pi_j$ there is one or more tests between $S_r = U_r^j$ and $U_v^j$, but there is no path that includes $U_v^j$ but not $U_r^j$.

(i) If $U_v^j$ is on the left path of the last test that separates $U_r^j$ and $U_v^j$ (path $\ell_i$), then it must be also on the right path of this test (path $\ell_i'$) because the corresponding path of $\ell_i'$ in $\Pi$ contains $S_{j+1}$ and the dags of $\Pi$ and $\Pi_j$ are equal. By applications of T4 and T5 several times we can move $U_v^j$ to be right after $U_r^j$, using the argument of 1). By T3 we change $B$ to $A$.

(ii) If $U_v^j$ is on the right path of the last test that separates $U_r^j$ and $U_v^j$, then by the above argument there must be a statement $U_w^j = U_v^j$ on the left path of that test. But because the left path precedes the right path in the enumeration and $U_1^j = S_1,\ldots,U_j^j = S_j$, this case is impossible.


3) There is a path that contains $U_v^j$ and does not contain $S_r$, and there is no test between $S_r$ and $U_v^j$.

(i) If $S_r$ is on the path left of the merging point, we use T4 to move $U_v^j$ to the merging point and then use the reverse of T5, and again T4. Increasing the number of statements by using the reverse of T5 does not cause any difficulty, because the number of statements along all the paths is equal in $\Pi$ and $\Pi_j$.

(ii) The case in which $S_r$ is on the path right of the merging point is impossible because we proved already for the left path $\ell'$, therefore $\Pi$ has the form

[Figure: two paths, each containing a statement $\varphi B_1 \ldots B_r$, meeting at a merging point.]

and there must be another statement $U_w^j$ in $\Pi_j$ that gives rise to a node labeled by $\varphi$ with descendants $B_1,\ldots,B_r$. This statement must be before the merging point, otherwise we apply the same argument. We use the same procedure as in 1).

(b) In $\Pi$ there are one or more test statements between $S_r$ and $S_{j+1}$.

Assume there is one test $t$ between $S_r$ and $S_{j+1}$. Because we assumed that corresponding paths in $\Pi'$ and $\Pi_j$ are consistent and have the same tests, and also corresponding paths in $\Pi'$ and $\Pi$ are consistent and have the same tests, also corresponding paths in $\Pi$ and $\Pi_j$ are consistent and have the same tests. Therefore in $\Pi_j$ there must be a test statement $t$ checking the same values, and there is no test statement between $U_r^j = S_r$ and $t$.

We use T5 in reverse as many times as necessary, so in $\Pi_j$, $t$ will appear right after $S_r$.

[Figure: application of the reverse of T5.]

Then we proceed with similar arguments to those in case (a). We use similar arguments when there is more than one test between $S_r$ and $S_{j+1}$.

We might conclude from (a) and (b) that the desired program $\Pi_{j+1}$ is obtained. In each stage we might have to flip blocks by T10 so that the blocks will appear in the same order in $\Pi$ and $\Pi_{j+1}$.

$\Pi_j \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi_{j+1}$

thus $\Pi' \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi_{j+1}$

and because none of these transformations operate on tests, corresponding paths in $\Pi'$ and $\Pi_{j+1}$ are consistent and have the same tests.
DEFINITION 

A  program  is  reduced  if  no  executable  path  of  the  program  has 
useless  or  redundant  statements  and  the  program  has  no  unreferenced 
blocks. 
The dags of a reduced program have the following properties:

1) No dag in $D(\Pi)$ has roots which are not distinguished.

2) No two nodes in a dag have identical direct descendants.
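
The two conditions can be illustrated on a single path without tests. The sketch below makes several assumptions that are not taken from the text: statements are encoded as `(target, operator, arguments)`, a statement is treated as useless when its value is not a subexpression of any output value, two statements are treated as redundant when they compute the same value, and the name `analyse_path` is hypothetical.

```python
def analyse_path(statements, outputs):
    """Return (useless, redundant) for a straight-line path.

    `useless` lists indices of statements whose value does not contribute to
    any output value; `redundant` lists pairs (first, later) of statement
    indices that compute the same symbolic value.
    """
    env = {}      # variable -> symbolic value (inputs stand for themselves)
    values = []   # symbolic value computed by each statement
    for target, operator, arguments in statements:
        value = (operator,) + tuple(env.get(a, a) for a in arguments)
        env[target] = value
        values.append(value)

    needed = set()

    def mark(expression):
        # Record an expression and, recursively, all its subexpressions.
        if isinstance(expression, tuple) and expression not in needed:
            needed.add(expression)
            for sub in expression[1:]:
                mark(sub)

    for variable in outputs:
        mark(env.get(variable, variable))

    useless = [i for i, value in enumerate(values) if value not in needed]
    first_seen, redundant = {}, []
    for i, value in enumerate(values):
        if value in first_seen:
            redundant.append((first_seen[value], i))
        else:
            first_seen[value] = i
    return useless, redundant


overwritten = [("A", "psi", ("B", "C")), ("A", "phi", ("B", "C"))]
repeated = [("A", "phi", ("B", "C")), ("D", "phi", ("B", "C")),
            ("E", "g", ("A", "D"))]
```

In `overwritten` the first assignment is useless because its value is replaced before it is used; in `repeated` the first two statements are redundant because both compute the value of `phi` applied to the inputs `B` and `C`.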
THEOREM 6

Any program $\Pi$ can be transformed to an equivalent reduced program $\Pi'$ using no larger set of transformations than $\{1,2,4,5,9,11\}$.

The  following  example  shows  how  the  transformations  are  used 
to  reduce  a  given  program. 
EXAMPLE  13 

Let $\Pi$ be $\Pi = (G, \{B,C\}, \{L,N\})$

[Figure: the graph $G$ of $\Pi$.]

The statement $N \leftarrow \varphi AL$ is useless in the right path. To remove the statement we first use the reverse of T5.

[Figure: the program obtained by the reverse of T5.]

Now T1 might be used to remove the useless statement:

      A ← φBC
      L ← φAC
      t(A)
         left branch:    N ← φAL
                         L ← φAN
                         N ← φBC
                         STOP
         right branch:   N ← φBA
                         STOP

The obtained program is reduced.

PROOF:

Assume $S_i^k : A \leftarrow \varphi B_1 \ldots B_r$ is a useless statement in an executable path $k$ of the program $\Pi$.

(a) If $S_i$ is useless in all paths containing it, then we operate T1 to remove $S_i$.

(b) If there exists at least one path $\ell$ such that $S_i$ is not useless in $\ell$, let $S_j = t(C_1,\ldots,C_q)$ be the first test statement after $S_i$.

i) Assume there is no path that includes $t(C_1,\ldots,C_q)$ and does not include $S_i$. $S_i$ is useless in path $k$, so $A$ is not referenced by any statement $S_{i+1}^k,\ldots,S_j^k$. Thus we might use T4 as many times as necessary so that $S_i$ appears before the test statement. Now we use the reverse of T5. We get

[Figure: the statement $A \leftarrow \varphi B_1 \ldots B_r$ on both branches of $t(C_1,\ldots,C_q)$.]
ii) Assume there is a path that includes $t(C_1,\ldots,C_q)$ and does not include $S_i$.

[Figure: a path reaching $t(C_1,\ldots,C_q)$ without passing through $S_i$.]

We operate T11 in reverse and proceed as in case i). We repeat the process of i) and ii) above as many times as necessary till $S_i$ will be useless in all paths that include it and T1 can be applied.

Assume $S_i : A \leftarrow \varphi D_1 \ldots D_r$ and $S_j : B \leftarrow \varphi D_1 \ldots D_r$ are redundant in path $k$. If there is no path that includes $S_j$ and does not include $S_i$, and also there is no path $\ell$ that includes $S_i$ and $S_j$ but they are not redundant in that path, we can operate T2 and remove $S_j$.

If there is a path that includes $S_j$ and does not include $S_i$, we operate T11 in reverse

[Figure: application of the reverse of T11.]

and now T2 can be applied.

Assume there is a path $\ell$ that includes $S_i$ and $S_j$ and they are not redundant in that path;

[Figure: the paths $k$ and $\ell$ through $S_i$ and $S_j$.]

then we again operate T11 in reverse

[Figure: application of the reverse of T11.]

and now T2 can be applied to remove $S_j$ from path $k$.

Unreferenced blocks can be removed by applying T9. So

$\Pi \stackrel{*}{\Rightarrow}_{\{1,2,4,5,9,11\}} \Pi'$.

Lemma  6 

Let  IT  be  a  program  in  which  each  path  is  executable  and  E  is  a  well- 

formed  expression  over  8  and  Z  such  that  E  is  in  v  (ll)  for  some  path 

I   or  E  =  v  (C.)  where  t  (C,,...,C  )  is  a  test  statement  in  path  £. 

E  /  Z.   Let  E  be  a  well-formed  expression,  E-./Z,  E.  is  a  subexpression 

of  E,  then  there  is  a  statement  A  «-  cpA,  ...A,  in  path  l   such  that  the 

value  of  A  computed  at  that  time  is  En . 
PROOF:

Because of the properties of well-formed expressions, there is a unique way to write an expression $E$ as $\varphi E_1 \ldots E_k$ where $E_1,\ldots,E_k$ are subexpressions. The proof follows from the definition of the value of a variable.

Lemma 7

Let $\Pi$ and $\Pi'$ be two equivalent, reduced programs in which all paths are executable and corresponding paths are consistent and have the same tests, and let $\ell$ and $k$ be two corresponding paths of $\Pi$ and $\Pi'$ respectively.
Then the assignment statements of $\ell$ and $k$ are in one-to-one correspondence such that if $S_j^{\ell} : A \leftarrow \varphi A_1 \ldots A_r$ and $U_q^k : B \leftarrow \Psi B_1 \ldots B_r$ correspond, then $v_j^{\ell}(A) = v_q^k(B)$.


PROOF:

Let $S_j^{\ell} : A \leftarrow \varphi A_1 \ldots A_r$ be an assignment statement of $\ell$ such that $v_j^{\ell}(A) = E$. $\Pi$ is reduced, therefore one of the following two conditions must hold.

1) $E$ is a subexpression of an expression in $v^{\ell}(\Pi)$. In this case, since $\ell$ and $k$ are consistent and $\Pi \equiv \Pi'$, $v^{\ell}(\Pi) = v^k(\Pi')$. Therefore $E$ is also a subexpression of an expression in $v^k(\Pi')$.

2) $E$ is a subexpression of an expression $v^{\ell}(C_i)$ such that $t(C_1,\ldots,C_n)$ is a test statement in path $\ell$. In this case, since $\ell$ and $k$ are consistent, corresponding, and have the same tests, there exists a test statement $t(D_1,\ldots,D_n)$ in $k$ such that $v^k(D_i) = v^{\ell}(C_i)$. Thus $E$ is also a subexpression of $v^k(D_i)$.

By Lemma 6 there is at least one $q$ such that $v_q^k(B) = E$. We will prove that $q$ is unique. Suppose there are $q_1$ and $q_2$ for some $j$. (If there is more than one such $j$ select the first one, and if for that $j$ there are more than two $q$'s, select the first two.) Let

$U_{q_1}^k : B \leftarrow \Psi_1 B_1 \ldots B_r$

$U_{q_2}^k : C \leftarrow \Psi_2 C_1 \ldots C_r$

$v_{q_1}^k(B) = v_{q_2}^k(C)$, thus $\Psi_1 = \Psi_2$ and $v_{q_1}^k(B_i) = v_{q_2}^k(C_i)$, $1 \le i \le r$.

Because of the selection of $j$, $q_1$ and $q_2$, $B_i = C_i$, $1 \le i \le r$. Thus we can apply T2 to $\Pi'$, and this contradicts the assumption that $\Pi'$ is reduced.
THEOREM  7 

Let $\Pi$ and $\Pi'$ be two reduced programs in which

1) each path is executable,

2) corresponding paths are consistent and have the same tests.
Then

$\Pi \equiv \Pi'$ iff $D_i(\Pi) = D_i(\Pi')$ $\forall i$, $1 \le i \le n$.

$\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ are the enumerations of $\Pi$ and $\Pi'$ respectively.

PROOF:

1) If for all $i$, $1 \le i \le n$, $D_i(\Pi) = D_i(\Pi')$, by Theorem 5 $\Pi \stackrel{*}{\Rightarrow}_{\{3,4,5,10,11\}} \Pi'$, and because transformations preserve equivalence, $\Pi \equiv \Pi'$.

2) If $\Pi \equiv \Pi'$ and there is at least one $i$ for which $D_i(\Pi) \ne D_i(\Pi')$, there is a node $N$ in $D_i(\Pi)$ which is not included in $D_i(\Pi')$, but all its descendants are present (simplest case: where the descendants are the input variables, which are the same for both programs). This contradicts Lemma 7 because we showed that the assignment statements are in one-to-one correspondence $A \leftarrow \varphi A_1 \ldots A_r \leftrightarrow B \leftarrow \Psi B_1 \ldots B_r$ such that $v_j^{\ell}(A) = v_q^k(B)$. Thus $\varphi = \Psi$ and $v_j^{\ell}(A_q) = v^k(B_q)$, $1 \le q \le r$. Since each node of the dag is associated with an assignment statement, the contradiction follows.
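
The comparison of dags used in this proof can be sketched for a single path without tests. The representation below is a simplification assumed for illustration only: a dag is the set of symbolic values computed along the path, its distinguished nodes are taken to be the values of the output variables, and `path_dag` is a hypothetical name.

```python
def path_dag(statements, outputs):
    """Build a simplified dag for a straight-line path.

    Each assignment (target, operator, arguments) contributes one node, the
    symbolic value it computes; equal values share a node. The result pairs
    the node set with the distinguished (output) nodes.
    """
    env = {}
    nodes = set()
    for target, operator, arguments in statements:
        node = (operator,) + tuple(env.get(a, a) for a in arguments)
        nodes.add(node)
        env[target] = node
    distinguished = frozenset((v, env.get(v, v)) for v in outputs)
    return frozenset(nodes), distinguished


original = [("A", "phi", ("B", "C")), ("M", "psi", ("B",)),
            ("L", "g", ("A", "M"))]
# Same computation with a renamed temporary and two statements flipped.
renamed = [("M", "psi", ("B",)), ("X", "phi", ("B", "C")),
           ("L", "g", ("X", "M"))]
different = [("A", "phi", ("C", "B")), ("M", "psi", ("B",)),
             ("L", "g", ("A", "M"))]
```

Renaming a temporary variable or flipping two independent assignments leaves the dag unchanged, which matches the role T3 and T4 play in Theorem 5, whereas changing the order of the operands of `phi` gives a different dag.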
THEOREM  8 

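Let $\Pi$ and $\Pi'$ be two loop-free programs. Then

$\Pi \equiv \Pi'$ iff $\Pi \stackrel{*}{\Rightarrow}_{\{1,2,6,7,9,10,11\}} \Pi'$.

PROOF:

1) "if" is trivial because transformations preserve equivalence.

2) By Theorem 3

(*)   $\Pi \stackrel{*}{\Rightarrow}_{\{6,11\}} \Pi_1$,   $\Pi' \stackrel{*}{\Rightarrow}_{\{6,11\}} \Pi_1'$

where $\Pi_1$, $\Pi_1'$ are programs in which each path is executable.

$\Pi_1 \equiv \Pi \equiv \Pi' \equiv \Pi_1'$.

By Theorem 4

$\Pi_1 \stackrel{*}{\Rightarrow}_{\{1,5,7,8,11\}} \Pi_2$,   $\Pi_1' \stackrel{*}{\Rightarrow}_{\{1,5,7,8,11\}} \Pi_2'$

such that corresponding paths in $\Pi_2$ and $\Pi_2'$ are consistent and have the same tests and each path in $\Pi_2$ and $\Pi_2'$ is executable. By Lemmas 3 and 4

(**)   $\Pi_1 \stackrel{*}{\Rightarrow}_{\{1,2,6,7,11\}} \Pi_2$,   $\Pi_1' \stackrel{*}{\Rightarrow}_{\{1,2,6,7,11\}} \Pi_2'$

$\Pi_2 \equiv \Pi_1$ and $\Pi_2' \equiv \Pi_1'$, therefore $\Pi_2 \equiv \Pi_2'$.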
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We  transform  IT  and  n'  to  equivalent  reduced  programs  TT 
d  2  5 

and  TT',  respectively.   By  Theorem  6 
3 


2  {l,2,l+,5,9,ll}   3      2  (1,2, U, 5,9,11)   5 
By  Lemmas  2,  3 

■X-  X- 

2  (1,2,9,11^  3     2  (1,2,9,11}   ^ 

IT  -  np  thus  n^  ~  n^  * 

By  Theorem  7  since  TT,  and  TT '  are  reduced  programs  in  which 

3      3 

each  path  is  executable  and  corresponding  paths  are  consistent  and 

have  the  same  tests 

Vl.   1  <  i  <  n  d.(ttJ  =  D.(n;) 
'   —   —    i  3    x  3 

v/here    i.  ....  .1     and  kn ,. . .  ,k     are  the  enumerations  of  IT,  and  TT' 

1'   '  n     J.     n  3     3 

respectively. 

■k- 
By  Theorem  5  IT,  >  TT'  . 

3  {3,^,5,10,11}    5 

By  Lemmas  1-3  TT,         =#>  TT'  . 
3  {1,2,10,11}'   3 


By  using  (***)  IT, 


^  TT' 


2  {1,2,9,10,11}   2 

By  using  (**)  IT         *   >  JL" 

1  [1,2,6,7,9,10,111    1 

and  by  using  (*)  II  ^>-  TI '  . 

{1,2,6,7,9,10,11} 

This  shows  that  {T1,T2,T6,T7,T9,T10,T11}  form  a  complete  set  of 

transformations. 
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-■::::  :  :  .: 

If  J  is  a  complete  set  of  transformations,  but  no  proper 
subset  of  J  is  complete,  then  J  is  called  an  irredundant  set  of 
transformations. 
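
The definition of an irredundant set can be read as a brute-force check over proper subsets. The sketch below is only an illustration: it assumes a completeness oracle `is_complete` is supplied by the caller, and `toy_is_complete` is a hypothetical stand-in, not a property of the transformations T1-T11 (for real programs, completeness is established by proof, as in Theorem 8, not by such an oracle).

```python
from itertools import combinations


def is_irredundant(transforms, is_complete):
    """Return True if `transforms` is complete and no proper subset is complete."""
    full = frozenset(transforms)
    if not is_complete(full):
        return False
    # Sizes 0 .. len(full)-1 enumerate exactly the proper subsets.
    for size in range(len(full)):
        for subset in combinations(sorted(full), size):
            if is_complete(frozenset(subset)):
                return False
    return True


def toy_is_complete(subset):
    """Hypothetical oracle for illustration: complete iff T1 and T2 are present."""
    return {"T1", "T2"} <= subset
```

With this toy oracle, `{"T1", "T2"}` is irredundant, while `{"T1", "T2", "T3"}` is complete but not irredundant.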
THEOREM 9

$J = \{T1, T2, T6, T7, T9, T10, T11\}$ forms an irredundant set of
transformations.
PROOF:

By Theorem 8 the set $J$ is complete. We will prove the
theorem by showing that each of the transformations in $J$ has to be in
the irredundant set.

Let $S$ be the set $\{T_j\}_{j=1}^{11}$.

a) To show that T1 has to be in $J$ we will use Aho and
Ullman's example (1).

$\Pi_1 = (P, \{B,C\}, \{A\})$

P:   $A \leftarrow \psi BC$
     $A \leftarrow \varphi BC$
     STOP

$\Pi_1' = (P', \{B,C\}, \{A\})$

P':  $A \leftarrow \varphi BC$
     STOP

$\Pi_1 \underset{T1}{\Longrightarrow} \Pi_1'$, but there is no sequence of transformations
from $S - \{T1\}$ that can get rid of the first statement. The only other
transformation that can remove an assignment statement from an
executable path is T2, and T2 eliminates only redundant statements. Here
the expressions $\psi BC$ and $\varphi BC$ are different, so neither statement is
redundant. Adding new paths to the program by T7 and T11 will not help,
because there will still be at least one path that contains the useless
statement.
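
The notion of a useless statement in example a) can be illustrated for straight-line code with a backward scan. This is a minimal sketch, not the thesis's formal definition of T1: it assumes a program is a list of hypothetical `(target, operator, arguments)` triples with no tests, and it calls a statement useless when its target is overwritten or never needed before an output is read.

```python
def useless_statements(program, outputs):
    """Indices of statements whose result never reaches an output variable.

    `program` is straight-line code: a list of (target, operator, args).
    """
    live = set(outputs)          # variables whose current value is still needed
    useless = []
    for index in range(len(program) - 1, -1, -1):
        target, _operator, args = program[index]
        if target in live:
            live.discard(target)  # this statement supplies the needed value
            live.update(args)     # so its operands become needed
        else:
            useless.append(index)
    return sorted(useless)


# Program P of example a): A <- psi B C ; A <- phi B C ; STOP, with output A.
P = [("A", "psi", ("B", "C")), ("A", "phi", ("B", "C"))]
P_PRIME = [("A", "phi", ("B", "C"))]
```

Here `useless_statements(P, {"A"})` reports the first statement, the one that T1 removes, and nothing is reported for `P_PRIME`.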




b) We will show that T2 is in $J$ by using again an example
of Ullman and Aho (1).

$\Pi_2 = (P, \{C,D\}, \{A,B\})$

P:   $A \leftarrow \varphi CD$
     $B \leftarrow \psi AA$
     $A \leftarrow \varphi CD$
     STOP

$\Pi_2' = (P', \{C,D\}, \{A,B\})$

P':  $A \leftarrow \varphi CD$
     $B \leftarrow \psi AA$
     STOP

$\Pi_2 \underset{T2}{\Longrightarrow} \Pi_2'$, but no combination of transformations from
$S - \{T2\}$ can get rid of the third statement. That is because none of
the statements is useless, and therefore T1 cannot be applied. Also,
if we introduce new paths to the program by T7 and T11 there will still
be at least one path that includes the redundant statement. Thus

$$\Pi_2 \overset{*}{\underset{S - \{T2\}}{\not\Longrightarrow}} \Pi_2'$$
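
The redundancy in example b) can be made concrete by computing symbolic values along a straight-line program. The sketch below is an assumption-laden illustration, not the thesis's definition of T2: statements are hypothetical `(target, operator, arguments)` triples, every argument is an input or previously defined, and only the narrow case is detected in which a statement recomputes the value its target already holds.

```python
def redundant_statements(program, inputs):
    """Indices of statements that assign to a variable the value it already has."""
    value = {name: name for name in inputs}   # an input variable denotes itself
    redundant = []
    for index, (target, operator, args) in enumerate(program):
        expression = (operator,) + tuple(value[arg] for arg in args)
        if value.get(target) == expression:
            redundant.append(index)
        value[target] = expression
    return redundant


# Program P of example b): A <- phi C D ; B <- psi A A ; A <- phi C D ; STOP.
P2 = [
    ("A", "phi", ("C", "D")),
    ("B", "psi", ("A", "A")),
    ("A", "phi", ("C", "D")),
]
```

For `P2` with inputs `C` and `D`, only the third statement is reported, matching the statement that T2 removes.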


c) [Figure: programs $\Pi_6$ and $\Pi_6'$. The recoverable labels are the
tests t(B)3,4 and t(B)7,8, the statements 3: $A \leftarrow \varphi BC$,
4: $A \leftarrow \psi BC$ and 8: $D \leftarrow \psi AA$, and STOP nodes.]

No combination of transformations of $S - \{T6\}$ can change the conditional
test t(B)7,8 in $\Pi_6$ to an unconditional test and thus eliminate the
nonexecutable path. That is because none of these transformations
operates on transfer addresses of tests and decides whether a conditional
test can be made unconditional. Introducing new paths to the
program by T7 and T11 will not help, since the above path will remain
nonexecutable.

d) Let $\Pi_7$ be

        t(A)7,7
     7  $B \leftarrow \varphi CD$
        STOP

which T7 transforms to

        $B \leftarrow \varphi CD$
        STOP

We will show that no combination of transformations from
$S - \{T7\}$ can eliminate the useless test.

If we look at the transformations that operate on tests
(T6, T8, T11), then if we use T11 reverse we get

        t(A)7,8
     7  $B \leftarrow \varphi CD$
        STOP
     8  $B \leftarrow \varphi CD$
        STOP

and t(A) cannot be eliminated from the obtained program except by
using T11 again and T7. Introducing other new paths can only be done
by using T7 and T11, or by introducing nonexecutable paths by T6
reverse. If we introduce a test by T6 reverse

[Figure: the program obtained by T6 reverse. The recoverable labels are a
new test t(A)3,4, the statement 3: $B \leftarrow \varphi CD$, a second copy of
$B \leftarrow \varphi CD$, and STOP nodes.]

we can eliminate the original t(A) by using T6 again, but we now have a
new test that cannot be eliminated. T8 operates on tests but cannot
eliminate useless tests.

e) For the case of T9 (removal of unreferenced blocks), we
observe that T1-T5 manipulate assignment statements and do not operate
on whole blocks, T6-T8 operate on tests, T10 flips blocks and T11
merges identical subgraphs. None of these can remove unreferenced blocks,
because such blocks do not appear on the graph. Therefore no sequence of
transformations from $S - \{T9\}$ can get rid of unreferenced blocks.

f) Let $\Pi_{10}$ be

        t(A)1,2
     1  $A \leftarrow \varphi BC$
        STOP
     2  $A \leftarrow \psi BC$
        STOP

and let $\Pi_{10}'$, obtained from $\Pi_{10}$ by T10, be

        t(A)1,2
     2  $A \leftarrow \psi BC$
        STOP
     1  $A \leftarrow \varphi BC$
        STOP

No sequence of transformations from $S - \{T10\}$ can transform
$\Pi_{10}$ to $\Pi_{10}'$. That is because no such sequence can change
the relative positions of different blocks in the program. T4 changes
the relative order of assignment statements but not of blocks. Also T11
cannot merge these blocks because they are not equal. Introducing new
tests will not help, since the transfer addresses of different tests
might be changed but the blocks themselves will not be removed.

g) Let $\Pi_{11}$ be

        t(A)1,2
     1  STOP
     2  STOP

which T11 transforms to

        t(A)
        STOP

No sequence of transformations from $S - \{T11\}$ can merge these
subgraphs. That is because we have a STOP statement and not an
assignment statement, and therefore T5 cannot be used. Introducing new
paths can only be done by T11, so this program cannot expand.

Theorem 9 showed that the complete set of transformations
found by Theorem 8 is also irredundant.

5.  OPTIMIZATION

We  would  like  to  provide  a  scheme  for  optimization  in  which  a 
sequence  of  the  transformations  is  applied  for  getting  an  optimal  code. 

DEFINITION 

We say that a cost function on programs is reasonable provided
that

(1) the cost of a program decreases if statements which are
never executed are deleted from the program in such a way that
statements are not added to the program;

(2) the cost of a program decreases if a statement is deleted
from some executable sequence of the program;

(3) the cost of a program decreases if identical subgraphs
and identical statements can be merged in such a way that test statements
are not added to the program.

If we consider a cost function which is some combination of
speed and size of programs, then:

(1) reduces the size of the program. When the graph of the program is
a tree, eliminating nonexecutable paths always reduces the size of the
program. When the graph is not a tree, T11 reverse should be used in
some cases to copy subgraphs so that nonexecutable paths may be
eliminated (see Theorem 2), although sometimes the subgraphs can be merged
again by another application of T11 after the nonexecutable paths have
been eliminated. (1) also includes removing unreferenced blocks, which
always reduces the size of the program.

(2) increases the speed of the program and decreases the size in some
cases, although in other cases the size might be increased. The size of
the program might increase when its graph is a tree, if T5 is used. When
the graph is not a tree, T11 must be used in some cases to copy subgraphs
so that statements can be deleted (see Theorem 6). As in (1), in some
cases subgraphs can be merged again after statements have been deleted.

(3) reduces the size of the program.

We may conclude that we consider as reasonable a combination of speed
and size in which the size may be increased in order that the program
will run faster, but otherwise the size is decreased whenever possible.
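
One cost function with this behaviour can be sketched as a lexicographic pair. The representation is an assumption made for illustration only: a program is given as its list of executable paths plus the set of all statements it contains, the total length of the executable paths stands in for speed, and the number of statements stands in for size.

```python
def cost(executable_paths, all_statements):
    """Lexicographic cost: speed proxy first, program size second.

    Python compares tuples component by component, so a program that is
    faster is always cheaper, and size only breaks ties in speed.
    """
    total_path_length = sum(len(path) for path in executable_paths)
    return (total_path_length, len(all_statements))


# Hypothetical statements used only as labels.
a, b, c, d = "A<-fBC", "D<-gAA", "E<-hA", "F<-gBB"
before = cost([[a, b], [a, c]], {a, b, c, d})   # d is never executed
```

Deleting the never-executed statement `d` lowers the second component, as in (1), and deleting `b` from the first path lowers the first component, as in (2), even if the size were to grow.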

Other criteria of cost than speed and size can be introduced
provided they do not conflict with (1)-(3) above. For example, reducing the
frequency of storing and recovering the partial results in computing
arithmetic expressions is a reasonable cost criterion.

An optimal program for a given program $\Pi$ will be a program
equivalent to $\Pi$ whose cost cannot be decreased.

Every optimal program is reduced (by (1) and (2) above) and, if
the graph of the program is a tree, each path of an optimal program is
executable (by (1)).

The following theorem will give a general procedure for optimizing
programs with no loops under reasonable cost functions, assuming the
cost function is independent of the costs of conditional tests in the
program. Reducing the number of conditional tests in the program can be
done by reordering the tests. Reordering the tests also might minimize
the average time through them when more information is given on the costs
of different tests and the probabilities of their outcomes. This type of
optimization involves using T6, T7 and T11 to add new tests and nonexecutable
paths to the program and to eliminate other tests that become useless and
paths that become nonexecutable. There is some difficulty in finding an
efficient procedure for implementing this process using the transformations.
Also, if different sequences of tests result in executing different sequences
of assignment statements in the program, reordering of tests will not
reduce the number of tests in the program. Therefore generally this type
of optimization does not improve the code generated.

Although the optimization procedure of Theorem 10 does not
minimize the number of tests in the program, it merges conditional tests
appearing in identical subgraphs and it eliminates conditional branches
to nonexecutable paths.

THEOREM 10

Assuming the cost function is independent of the costs of
conditional tests in the program, there is an algorithm that finds an optimal
program $\Pi'$ equivalent to a given program $\Pi$. The algorithm operates in a
series of steps so that $\Pi$ is first transformed to an equivalent reduced
program $\Pi_c$ which is independent of the specific cost function used.
Using T8 repeatedly, $\Pi_c$ is nondeterministically transformed to a program
$\Pi''$. $\Pi'$ is obtained by operating {T1,T3,T4,T5,T7,T11} on $\Pi''$.

PROOF:

We first assume that all nonexecutable paths can be eliminated
from $\Pi$ in such a way that statements are not added to the program. Thus
in every optimal program $\Pi'$ each path is executable. Also every optimal
program is reduced. So take $\Pi_1$ to be a reduced program equivalent to $\Pi$
in which each path is executable and no two statements in a path define
the same variable. We can obtain $\Pi_1$ by first eliminating nonexecutable
paths by T6 and T11 (Th. 3), then reducing the obtained program by
operating T1,T2,T4,T5,T9,T11 (Th. 6) and then renaming variables by T3 such that
no two statements in a path of $\Pi_1$ define the same variable. Thus

$$\Pi \overset{*}{\underset{\{1,2,3,4,5,6,9,11\}}{\Longrightarrow}} \Pi_1.$$

$\Pi_1$ and $\Pi'$ are reduced programs in which each path is executable and
$\Pi_1 \equiv \Pi$, $\Pi' \equiv \Pi$, therefore $\Pi_1 \equiv \Pi'$. Using Th. 4 we can find two
programs $\Psi$, $\Psi'$,

$$\Pi_1 \overset{*}{\underset{\{1,5,7,8,11\}}{\Longrightarrow}} \Psi, \qquad \Pi' \overset{*}{\underset{\{1,5,7,8,11\}}{\Longrightarrow}} \Psi'$$

such that corresponding paths in $\Psi$ and $\Psi'$ are consistent and have the
same tests. $\Pi_1 \equiv \Pi'$, therefore $\Psi \equiv \Psi'$. By looking at the algorithm of
Th. 4 we observe that if $\Pi_1$ and $\Pi'$ are reduced programs, $\Psi$ and $\Psi'$ are
also reduced. This is because the algorithm does not add useless statements
and unreferenced blocks, and also statements can be added to the programs
in such a way that there are no redundant statements. By Th. 7, for all
$i$, $D_i(\Psi) = D_i(\Psi')$, and by Th. 5

$$\Psi \overset{*}{\underset{\{3,4,5,10,11\}}{\Longrightarrow}} \Psi'.$$

Hence

$$\Pi_1 \overset{*}{\underset{\{1,5,7,8,11\}}{\Longrightarrow}} \Psi \overset{*}{\underset{\{3,4,5,10,11\}}{\Longrightarrow}} \Psi' \overset{*}{\underset{\{1,5,7,8,11\}}{\Longrightarrow}} \Pi', \quad \text{thus} \quad \Pi_1 \overset{*}{\underset{\{1,3,4,5,7,8,10,11\}}{\Longrightarrow}} \Pi'.$$

Because removing of unconditional useless test statements might
expose pairs of tests that can be flipped by T8, we must operate T7
in the forward direction before operating T8. Starting with the program $\Pi_1$
we apply T10 to flip blocks and place after each unconditional test
$t(A_1, \ldots, A_r)\,k,k$ in which $k$ references the first statement of a block,
the block referenced by $k$. This allows us to use T7 to remove all
unnecessary unconditional tests. Next we apply T1 to remove any
assignment statements which define variables that are referenced only by the
unconditional tests just removed. The resulting program is called $\Pi_c$.

$$\Pi_1 \overset{*}{\underset{\{1,7,10\}}{\Longrightarrow}} \Pi_c \quad \text{and} \quad \Pi_c \overset{*}{\underset{\{1,3,4,5,7,8,10,11\}}{\Longrightarrow}} \Pi'$$

All the steps in going from $\Pi$ to $\Pi_c$ were deterministic. $\Pi_c$
is independent of the given cost function.

Because flipping of tests might expose identical subgraphs and
also might make merging of identical statements possible, we will first
operate T8. We will operate T8 as many times as necessary so that
identical statements can be merged later, and also so that identical
subgraphs can be merged in such a way that the cost of the program is not
increased. We obtain a program $\Pi''$ such that $\Pi_c \overset{*}{\underset{\{8\}}{\Longrightarrow}} \Pi''$.

Because applying T8 does not change the number of statements
of the program, $\Pi''$ can be found in a finite number of steps, although
a heuristic procedure might help to find $\Pi''$ quickly. Also, different
cost functions will cause different $\Pi''$ to be found. So an algorithm
(or heuristic) based on the nature of the specific cost function might
be useful.


[Figure: example program with the tests t(A) and t(B). The only
recoverable labels are an assignment to D and three STOP nodes.]

The statement $D \leftarrow \varphi AC$ might be translated into the following machine
instructions:

     a → acc
     acc + c → acc
     acc → d

where a, c, d are the addresses of A, C, D respectively and the interpretation
is $I_n(\varphi AC) = +AC$. t(A) might be translated into:

     a → acc
     t(acc)

Thus if we flip the tests t(A) and t(B) we can save one of the instructions
a → acc. Therefore flipping of tests may improve the machine code obtained
for some given cost functions.

Because $\Pi_c \overset{*}{\underset{\{1,3,4,5,7,8,10,11\}}{\Longrightarrow}} \Pi'$ and we used T8 to obtain $\Pi''$,
it follows that $\Pi'' \overset{*}{\underset{\{1,3,4,5,7,10,11\}}{\Longrightarrow}} \Pi'$. All applications of T8 can be
performed before applying {T1,T3,T4,T5,T7,T10,T11}. That is because
T3, T4, T10, T11 and T5 in the forward direction do not expose new tests
that can be flipped, and all operations of T7 and T1 have been performed
except those resulting from merging of paths.
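
The saving of one load instruction in the accumulator example above can be reproduced with a toy code generator. This is a sketch under stated assumptions, not the thesis's translation scheme: the operation tuples and the function `emit` are hypothetical, the machine has a single accumulator, `t(acc)` is assumed to leave the accumulator unchanged, and only the instructions along one path are listed.

```python
def emit(operations):
    """Translate operations along one path into single-accumulator instructions.

    ("test", X)        tests the value of variable X
    ("add", D, X, Y)   performs D <- X + Y
    A load is skipped when the accumulator already holds the needed variable.
    """
    code = []
    in_acc = None                      # variable whose value is in the accumulator
    for operation in operations:
        if operation[0] == "test":
            variable = operation[1]
            if in_acc != variable:
                code.append(f"{variable.lower()} -> acc")
                in_acc = variable
            code.append("t(acc)")
        else:
            _, dest, left, right = operation
            if in_acc != left:
                code.append(f"{left.lower()} -> acc")
            code.append(f"acc + {right.lower()} -> acc")
            code.append(f"acc -> {dest.lower()}")
            in_acc = dest              # the accumulator now holds the new value of dest
    return code


original = emit([("test", "A"), ("test", "B"), ("add", "D", "A", "C")])
flipped = emit([("test", "B"), ("test", "A"), ("add", "D", "A", "C")])
```

With the tests flipped, t(A) immediately precedes the statement that uses A, so `flipped` contains one fewer `a -> acc` instruction than `original`.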

It is easy to see that T5 and T11 can be operated independently
on $\Pi''$ in the sense that commuting the transformations will result in
programs having the same cost. That is because, if the subgraphs are
identical and the two identical statements to be merged are included in
both subgraphs, the resulting program is the same whether we first merge the
identical statements and later merge the identical subgraphs, or
first merge the identical subgraphs and then merge the identical
statements. If only one of the two identical statements to be
merged is included in the identical subgraphs:

[Figure: a program in which the statement $A \leftarrow \varphi B_1 \ldots B_r$ appears
both inside and outside the identical subgraphs.]

if we first merge the identical statements and then the identical subgraphs
we get

[Figure: the result of applying T5 and then T11, containing the statement
$A \leftarrow \varphi B_1 \ldots B_r$.]

and if we first merge the subgraphs we get

[Figure: the result of applying T11, containing the statement
$A \leftarrow \varphi B_1 \ldots B_r$.]

which has the same cost as the program above. By applying T5 twice we
can get the same result as above.

[Figure: the resulting program, containing the statement
$A \leftarrow \varphi B_1 \ldots B_r$.]

Thus we first operate T5 on $\Pi''$ as many times as necessary to
merge identical statements (we may have to use T3 and T4 to be able to
do so) and then we use T11 to merge identical subgraphs. As was shown
above, further applications of T5 might be used to get other programs with
the same cost. T11 will be operated only in those cases in which
identical subgraphs can be merged in such a way that tests are not added to the
program.

Merging identical subgraphs might expose useless unconditional
tests, so T7 has to be applied again. Again useless assignment
statements can be removed by T1. We get a program we shall call $\Pi_2$. We have
so far used T1,T3,T4,T5,T7 and T11 to transform $\Pi''$ into $\Pi_2$. Next we have
to use T10 and again T3 and T4 to obtain the optimal program $\Pi'$.

T10 (flipping of blocks) does not change the cost of the
program; it does not even change the graph of the program. So we might
conclude that for a program $\Pi$ there might be several optimal programs,
all having the same cost and the same graph, and all equivalent
by T10 only. Let $\Pi'$ be one of them. Further applications of T10 to
$\Pi'$ may be used to generate other optimal programs.

$\Pi'$ is obtained from $\Pi_2$ by applying T3 and T4.

The following part of the proof will be based on the arguments
of Ullman and Aho (1) for the straight line code case, which apply also
here.

Since in $\Pi_2$ no variable is defined twice, the steps used in
going from $\Pi_2$ to $\Pi'$ can be reordered so that all applications of T4 are
performed before applying T3. So the program $\Pi_3$ is a reordering of
independent statements of $\Pi_2$, $\Pi_2 \overset{*}{\underset{\{4\}}{\Longrightarrow}} \Pi_3$, and $\Pi'$ is a renaming of the
variables of $\Pi_3$.

The step in going from $\Pi_2$ to $\Pi_3$ is nondeterministic. There are
only finitely many possibilities for $\Pi_3$ because the number of statements
is not changed, but to avoid an exhaustive search for the optimal $\Pi_3$
some algorithm (or heuristic) based on the nature of the specific
cost function can be used to get $\Pi_3$ more quickly.

Since renaming does not change the cost, $\Pi'$ could be taken to
be equal to $\Pi_3$, and further applications of T3 may again be used to
generate other optimal programs.

In the first part of the proof of Theorem 10, we assumed that
nonexecutable paths can be eliminated so that statements are not added
to the program. Therefore in the optimal program each path is executable.

Assume now that not all nonexecutable paths of the program can be
eliminated so that statements are not added to the program. Therefore
take $\Pi_1$ to be a reduced program equivalent to $\Pi$ so that no two statements
in a path define the same variable, $\Pi \overset{*}{\Longrightarrow} \Pi_1$.

$\Pi_1 \equiv \Pi$, $\Pi' \equiv \Pi$, therefore $\Pi_1 \equiv \Pi'$.

Let $\Psi$ and $\Psi'$ be two programs equivalent to $\Pi_1$ and $\Pi'$
respectively, so that in $\Psi$ and $\Psi'$ each path is executable. By Theorem 3

$$\Pi_1 \overset{*}{\underset{\{6,11\}}{\Longrightarrow}} \Psi, \qquad \Pi' \overset{*}{\underset{\{6,11\}}{\Longrightarrow}} \Psi'.$$

Let $\Psi_1$ and $\Psi_1'$ be two programs equivalent to $\Psi$ and $\Psi'$ respectively
so that corresponding paths in $\Psi_1$ and $\Psi_1'$ are consistent and have the same
tests. By Th. 4

$$\Psi \overset{*}{\underset{\{1,5,7,8,11\}}{\Longrightarrow}} \Psi_1, \qquad \Psi' \overset{*}{\underset{\{1,5,7,8,11\}}{\Longrightarrow}} \Psi_1'.$$

$\Psi_1$ and $\Psi_1'$ are reduced, therefore by Th. 7 $D_i(\Psi_1) = D_i(\Psi_1')$ and by Th. 5

$$\Psi_1 \overset{*}{\underset{\{3,4,5,10,11\}}{\Longrightarrow}} \Psi_1'.$$

Starting with the program $\Pi_1$ we operate T6 and T11 in all the
cases in which the number of statements added is not greater than the number
of statements eliminated. The rest of the algorithm is the same as in
the first part of the proof of Theorem 10. We get a program $\Pi_c$,

$$\Pi_1 \overset{*}{\underset{\{1,6,7,10,11\}}{\Longrightarrow}} \Pi_c.$$

This completes the proof of Theorem 10.

We presented a program schema that models loop-free programs.
We found a complete set of equivalence preserving transformations and
showed that this set is also irredundant. In this chapter we provided
a procedure for optimization using the set of transformations. The
results obtained will be extended to the case in which algebraic laws
are assumed, and to the case in which the tests are Boolean combinations
of elemental predicates. Also we will show that the results obtained
hold for a broader class of programs: programs which always halt.

6.  ALGEBRAIC TRANSFORMATIONS

Algebraic  laws  that  are  known  to  hold  among  operators  can  often 
be  used  to  expose  common  subexpressions  in  the  program.   In  the  follow- 
ing example,  the  commutative  and  the  associative  laws  of  addition  are 
used  to  expose  a  common  subexpression  which  can  later  be  eliminated,  so 
that  the  cost  of  the  program  will  be  reduced. 


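EXAMPLE 14

$\Pi = (G, \{A,B,C\}, \{F,G\})$

[Figure: $\Pi$ is transformed by the associative and commutative laws of
addition, and then by T2, into $\Pi'$. Recoverable statements: after the
commutative law, $X \leftarrow +CB$ and $F \leftarrow +AX$; in $\Pi'$, $G \leftarrow +AC$
and $F \leftarrow +GB$. In each program the test t(F) leads to branches
containing $G \leftarrow +AC$ or $G \leftarrow -AC$, each followed by STOP.]

The common subexpressions above could not be exposed using the
topological transformations only.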

In this chapter we use a model of program schemata in which
assignment statements are of the form

$$A \leftarrow \psi B_1 \ldots B_r$$

where $A, B_1, \ldots, B_r \in \Sigma$ and $\psi$ is an $r$-ary operator from a countable set of
operators $\Phi$ over the domain $D$.

To provide an actual program for a given program we assign values
from the domain to the input variables and make a proper assignment of
predicates to the test names of the schema.

The model assumes that a set of algebraic laws $\Gamma$ holds among
operators from $\Phi$.

DEFINITION 

An algebraic identity is a pair $(\alpha, \beta)$ where $\alpha$, $\beta$ are well-formed
expressions over $\Phi$ and $\Sigma$.

Examples of algebraic identities are the associative law of
addition $(+A+BC, ++ABC)$ and the distributive law $(*A+BC, +*AB*AC)$.

Let $E_1$ and $E_2$ be two well-formed expressions, and let $\gamma = (\alpha, \beta)$
be an algebraic identity. $E_1$ is transformed to $E_2$ by $\gamma$ iff for each
variable $x$ appearing in $\alpha$ or $\beta$ there is a well-formed expression $E_x$ such that if
$\alpha'$ and $\beta'$ are $\alpha$ and $\beta$ with $E_x$ substituted for each instance of $x$, then
either $E_1 = \delta_1 \alpha' \delta_2$ and $E_2 = \delta_1 \beta' \delta_2$, or $E_1 = \delta_1 \beta' \delta_2$ and $E_2 = \delta_1 \alpha' \delta_2$.
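
The definition of transforming one expression into another by an identity can be sketched as pattern matching on expression trees. The representation is an assumption for illustration: an expression is a nested tuple `(operator, operand, ...)`, a variable is a string, and every variable of the replacement pattern is assumed to occur in the matched pattern (as it does for the associative and distributive laws). The names `match`, `substitute` and `rewrite_once` are hypothetical.

```python
def match(pattern, expression, binding):
    """Try to bind the pattern's variables so that the pattern equals the expression."""
    if isinstance(pattern, str):                      # a pattern variable
        if pattern in binding:
            return binding[pattern] == expression
        binding[pattern] = expression
        return True
    if (isinstance(expression, str) or pattern[0] != expression[0]
            or len(pattern) != len(expression)):
        return False
    return all(match(p, e, binding) for p, e in zip(pattern[1:], expression[1:]))


def substitute(pattern, binding):
    """Replace each variable of the pattern by the expression bound to it."""
    if isinstance(pattern, str):
        return binding[pattern]
    return (pattern[0],) + tuple(substitute(p, binding) for p in pattern[1:])


def rewrite_once(expression, alpha, beta):
    """All expressions obtained by replacing one instance of alpha by beta."""
    results = []
    binding = {}
    if match(alpha, expression, binding):
        results.append(substitute(beta, binding))
    if not isinstance(expression, str):
        for i in range(1, len(expression)):
            for child in rewrite_once(expression[i], alpha, beta):
                results.append(expression[:i] + (child,) + expression[i + 1:])
    return results


# The distributive law (*A+BC, +*AB*AC) and associative law (+A+BC, ++ABC).
DIST = (("*", "x", ("+", "y", "z")), ("+", ("*", "x", "y"), ("*", "x", "z")))
ASSOC = (("+", "x", ("+", "y", "z")), ("+", ("+", "x", "y"), "z"))
```

Since the definition allows an identity to be used in either direction, the reverse step is obtained by calling `rewrite_once(expression, beta, alpha)`.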

DEFINITION

Let $E_1$ and $E_2$ be two well-formed expressions and let $\Gamma$ be a set of
algebraic laws. $E_1$ and $E_2$ are equivalent under $\Gamma$ ($E_1 \equiv_\Gamma E_2$) iff $E_1$
can be transformed into $E_2$ by a finite sequence of laws in $\Gamma$.

DEFINITION

Let $\Pi$ and $\Pi'$ be two programs. Let $\ell$ and $k$ be two executable
paths of $\Pi$ and $\Pi'$ respectively. $\ell$ and $k$ are said to be consistent
under $\Gamma$ iff for all $t$, if

(i) $t$ is in the sequences of statements that correspond to both $\ell$ and $k$,
so that in path $\ell$ there is a statement $S_i$ of the form $t(C_1, \ldots, C_r)\ n_1, n_2$,
and in path $k$ there is a statement $S_j$ of the form $t(D_1, \ldots, D_r)\ m_1, m_2$, and

(ii) $v_i^{\ell}(C_q) \equiv_\Gamma v_j^{k}(D_q)$ for all $q$, $1 \le q \le r$, that is, the expressions
$v_i^{\ell}(C_q)$ and $v_j^{k}(D_q)$ are equivalent under $\Gamma$ for all $q$,

then the statement prefixed by $n_1$ is included in path $\ell$ iff the statement
prefixed by $m_1$ is included in path $k$, and the statement prefixed by $n_2$ is
included in path $\ell$ iff the statement prefixed by $m_2$ is included in path $k$.

For an arbitrary set of algebraic identities it is recursively
unsolvable to determine, given two expressions $E_1$ and $E_2$, whether $E_1 \equiv_\Gamma E_2$;
thus it is undecidable whether two paths are consistent.

In Example 14 above the two left paths $\ell_1$ and $k_1$ of $\Pi$ and $\Pi'$
respectively are consistent under a set of algebraic identities $\Gamma$ that
includes the associative and the commutative laws of addition.

$$v^{\ell_1}(F) = +A+BC$$

$$v^{k_1}(F) = ++ACB$$

$$+A+BC \equiv_\Gamma ++ACB$$
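
For the special case in which the only laws are associativity and commutativity of addition, equivalence of two expressions can be decided by a normal form, in contrast to the unsolvable general case. The sketch below assumes prefix strings with one-character symbols, the binary operators `+`, `-` and `*`, and treats every operator other than `+` as uninterpreted; the names `parse` and `canonical` are hypothetical.

```python
def parse(prefix):
    """Parse a prefix string such as '+A+BC' into a nested tuple."""
    def walk(position):
        symbol = prefix[position]
        if symbol in "+-*":                       # binary operators
            left, position = walk(position + 1)
            right, position = walk(position)
            return (symbol, left, right), position
        return symbol, position + 1               # a variable
    tree, end = walk(0)
    if end != len(prefix):
        raise ValueError("trailing symbols in " + prefix)
    return tree


def canonical(tree):
    """Flatten nested '+' and sort its operands (normal form for the two laws)."""
    if isinstance(tree, str):
        return tree
    operator = tree[0]
    operands = [canonical(child) for child in tree[1:]]
    if operator != "+":
        return (operator,) + tuple(operands)
    flat = []
    for operand in operands:
        if isinstance(operand, tuple) and operand[0] == "+":
            flat.extend(operand[1:])              # associativity: drop the nesting
        else:
            flat.append(operand)
    return ("+",) + tuple(sorted(flat, key=repr)) # commutativity: fix an order
```

Two expressions are then equivalent under these two laws exactly when their normal forms coincide, as for the two values of F in Example 14.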

DEFINITION

Two program schemata $\Pi$ and $\Pi'$ will be called equivalent under a
set of algebraic laws $\Gamma$ ($\Pi \equiv_\Gamma \Pi'$) iff for all pairs of paths $\ell$, $k$
consistent under $\Gamma$, for each expression $E^{\ell}$ in $v^{\ell}(\Pi)$ there is an
expression $E^{k}$ in $v^{k}(\Pi')$ such that $E^{\ell} \equiv_\Gamma E^{k}$, and conversely.

Because it is undecidable for an arbitrary $\Gamma$ whether two expressions
are equivalent under $\Gamma$, the question of equivalence under arbitrary $\Gamma$ for
two programs $\Pi$ and $\Pi'$ is undecidable.

Algebraic laws on expressions induce transformations on dags.
Thus algebraic transformations on programs can be described as
transformations on the corresponding dags.

If the distributive law holds for expressions $\alpha$, $\beta$, $\gamma$

$$*\alpha+\beta\gamma = +*\alpha\beta*\alpha\gamma$$

then the transformation on the dag is

[Figure: the dag $D_1$ for $*\alpha+\beta\gamma$ is transformed into the dag $D_2$
for $+*\alpha\beta*\alpha\gamma$.]

The transformation on dags induces a natural transformation on
any given dag of the program: we replace each instance of $D_1$ with $D_2$.
Before replacing instances of dags that correspond to algebraic
identities in a dag of a given program, we must make sure that

1) No node of the dag replaced (which is not a root) is
distinguished (otherwise distinguished nodes associated with output variables
and with variables referenced by tests might be eliminated by the
transformation).

2) No node of the dag replaced (which is not the root or a leaf)
has ancestors which are not nodes of the dag replaced, in all the dags that
include the node (because nodes might be eliminated or their values might
be changed by an algebraic transformation).

3) There is no node which is not distinguished and has no ancestors
(because then the node is useless, and there is no use in transformations on
useless statements).

4) T11 has been used so that there is no path that includes a
subexpression of the expression replaced, but not the expression itself.

DEFINITION

We will call an open program a program in which

a) no output variable is referenced after it is defined.

b) no variable not in I is referenced more than once (in all
paths that include the definition of this variable).

c) T1 cannot be applied.

d) T11 reverse cannot be applied.

e) no two output variables have the same value.

a)-d) will guarantee 1)-4) above.

By Lemma 8 below, each program can be transformed to an equivalent
open program. Thus if I is an algebraic identity and $\Pi$ and $\Pi'$ are two
programs in which corresponding paths are consistent under I, we say

$$\Pi \underset{I}{\Longrightarrow} \Pi' \quad \text{iff} \quad D_i(\Pi_0) \underset{I}{\Longrightarrow} D_i(\Pi_0')$$

where $\Pi_0$ and $\Pi_0'$ are open programs equivalent to $\Pi$ and $\Pi'$ respectively, and
$D_i$ are all the dags in $D(\Pi_0)$ that include the subtree corresponding to
the expression transformed by I.

Because by Theorem 5 dags characterize equivalence classes
under the transformations T3,T4,T5,T10,T11, the definition above allows
I to incorporate all these transformations.

We will be interested in algebraic transformations that operate
on programs in which each path is executable, because there is no use in
exposing common subexpressions in nonexecutable paths. For an arbitrary
set of identities $\Gamma$, it is undecidable whether a path is executable if algebraic
laws are assumed to hold among operators. For special cases of algebraic
identities, getting rid of nonexecutable paths can be done by a procedure
similar to that of Theorem 3. We apply a transformation similar to
T6 in which the left path is removed if $v(A) = v(B)$.

[Figure: the test t(A) and the path removed by this transformation.]


LEMMA 8

If $\Pi$ is a program in which each path is executable, there is an
open program $\Pi_0$ such that $\Pi \equiv \Pi_0$ and $\Pi \overset{*}{\underset{\{1,2,9,11\}}{\Longrightarrow}} \Pi_0$.

PROOF:

First we reduce $\Pi$ so that useless statements will be eliminated
and no two output variables will have the same value. We get a program
$\Pi_1$, $\Pi \overset{*}{\underset{\{1,2,9,11\}}{\Longrightarrow}} \Pi_1$. Now we operate T11 reverse as many times as necessary
so that we get a tree. We rename $\Pi_1$ such that no variable will be defined
more than once in the same path (T3). We get $\Pi_2$, $\Pi_1 \overset{*}{\underset{\{3,11\}}{\Longrightarrow}} \Pi_2$, and by
Lemma 1, $\Pi_1 \overset{*}{\underset{\{1,2,11\}}{\Longrightarrow}} \Pi_2$. Now we use T2 reversed (inserting redundant
statements) as many times as necessary so that no output variable will
be referenced after it is defined, and also so that no variable will be
referenced more than once. We get an open program $\Pi_0$, $\Pi_2 \overset{*}{\underset{\{2\}}{\Longrightarrow}} \Pi_0$, and

$$\Pi \overset{*}{\underset{\{1,2,9,11\}}{\Longrightarrow}} \Pi_0.$$




We  will  prove  an  equivalent  theorem  to  Th.  8  for  the  case  that 
algebraic  identities  are  assumed: 

THEOREM  11 

Let $\Pi$ and $\Pi'$ be two programs in which each path is executable.
Let $\Gamma$ be a set of algebraic identities. Then

$\Pi \equiv_\Gamma \Pi'$   iff   $\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\} \cup \Gamma} \Pi'$.

When algebraic identities are assumed to hold among operators,
the topological transformations that operate on tests consider identical
tests as tests checking values that are equivalent under $\Gamma$. When
equivalence under $\Gamma$ of two programs is decidable, it is decidable
whether the values checked by tests appearing in the program are
equivalent under $\Gamma$.

If $\Pi$ and $\Pi'$ are two programs in which each path is executable
and $\Pi \equiv_\Gamma \Pi'$, then
$\Pi \Rightarrow^{*}_{\{1,5,7,8,11\}} \Psi$,
$\Pi' \Rightarrow^{*}_{\{1,5,7,8,11\}} \Psi'$ such that in $\Psi$ and
$\Psi'$ corresponding paths are consistent under $\Gamma$. The procedure is
the same as in Theorem 4, only here tests that check equivalent expressions
under $\Gamma$ are considered to be identical.

PROOF: 

1) "if" is trivial because the transformations preserve
equivalence.

2) Let $\Psi$ and $\Psi'$ be two programs equivalent to $\Pi$ and $\Pi'$
respectively, in which corresponding paths are consistent under $\Gamma$:

$\Pi \Rightarrow^{*}_{\{1,5,7,8,11\}} \Psi$,   $\Pi' \Rightarrow^{*}_{\{1,5,7,8,11\}} \Psi'$

and by Lemmas 3 and 4

$\Pi \Rightarrow^{*}_{\{1,2,6,7,11\}} \Psi$,   $\Pi' \Rightarrow^{*}_{\{1,2,6,7,11\}} \Psi'$.

$\Pi \equiv_\Gamma \Pi'$, therefore $\Psi \equiv_\Gamma \Psi'$.
Let $\Pi_0$ and $\Pi_0'$ be open programs for $\Psi$ and $\Psi'$ respectively:

$\Psi \Rightarrow^{*}_{\{1,2,9,11\}} \Pi_0$,   $\Psi' \Rightarrow^{*}_{\{1,2,9,11\}} \Pi_0'$.

Let $v(\Pi_0) = \{\{E_j^i\}_{i=1}^{k_j}\}_{j=1}^{q}$.

There is a finite sequence of algebraic identities that operate
on each set $\{E_j^i\}_{i=1}^{k_j}$, $1 \le j \le q$, and convert it to the
set of expressions $\{E_j'^{\,i}\}_{i=1}^{k_j}$, where
$\{\{E_j'^{\,i}\}_{i=1}^{k_j}\}_{j=1}^{q} = v(\Pi_0')$. So there is a
sequence of program values $v_1, \ldots, v_r$ such that $v_1 = v(\Pi_0)$,
$v_r = v(\Pi_0')$ and $v_{l+1}$ is created from $v_l$ by applying one
identity in $\Gamma$ to one or more expressions in $v_l$. An identity
operates simultaneously on several expressions that are the values of the
same output variable in all the paths of the program.

We will show that if $P$ is an open program and $v(P) = v_l$, then
there is a program $P'$ such that $v(P') = v_{l+1}$ and
$P \Rightarrow^{*} P'$.

Let $I$ be the identity applied in going from $v_l$ to $v_{l+1}$. Since
$P$ is open, $I$ is applicable to the dags in $D(P)$. After operating on all
the dags $D_i(P)$ that include the subtree corresponding to the expression
transformed by $I$, we get a new set of dags $D'$ whose value is $v_{l+1}$.
Let $P'$ be such that $D(P') = D'$. Thus $P \Rightarrow_{I} P'$. Also,
because by Theorem 5 dags characterize equivalence classes under
T3, T4, T5, T10, T11, $P \Rightarrow^{*} P'$.
(We used Lemmas 1, 2, 3.)

Therefore $\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\} \cup \Gamma} \Pi'$.

The  following  theorem  will  give  a  procedure  for  optimizing 
programs  under  operator  and  operand  preserving  algebraic  identities. 

DEFINITION 

An  algebraic  identity  is  operator  preserving  if  the  number  of 
operators  on  both  sides  of  the  identity  is  the  same. 




The identity --ABC = -A+BC is operator preserving. Under an
operator preserving identity the number of assignment statements is
preserved.
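
As an illustration only, the operator-preserving condition can be checked mechanically on identities written in prefix notation. The sketch below assumes single-character symbols and a hypothetical operator alphabet `OPERATORS`; neither the function names nor the alphabet come from the text.

```python
# Hypothetical operator alphabet; every other character is treated as an
# operand or a constant (an assumption made for this sketch).
OPERATORS = set("+-*/")


def count_operators(prefix_expr: str) -> int:
    """Count operator occurrences in a prefix (Polish) expression string."""
    return sum(1 for symbol in prefix_expr if symbol in OPERATORS)


def is_operator_preserving(lhs: str, rhs: str) -> bool:
    """An identity lhs = rhs is operator preserving when both sides
    contain the same number of operators."""
    return count_operators(lhs) == count_operators(rhs)


# (A - B) - C = A - (B + C): two operators on each side.
print(is_operator_preserving("--ABC", "-A+BC"))  # True
# A + X * 1 = A + X: two operators on the left, one on the right.
print(is_operator_preserving("+A*X1", "+AX"))    # False
```

The check only counts symbols, so it says nothing about whether the identity is valid; validity has to be established separately.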

DEFINITION 

An  algebraic  identity  is  operand  preserving  if  each  operand  either 
i)   appears  exactly  once  on  each  side  of  the  identity,  or 
ii)  all  its  appearances  on  a  side  of  the  identity  follow  the  same  instance 
of  an  operator. 

Under  an  operand  preserving  identity,  openness  of  programs  is 
preserved. 

The definition of operand preserving identity includes that
of Aho and Ullman (1).

**XXY  =  *Y*XX  is  an  operand  and  operator  preserving  identity. 

THEOREM  12 

Let $\Gamma$ be a set of operator and operand preserving identities.

There is an algorithm that finds for every program $\Pi$ an optimal
program $\Pi'$ such that $\Pi \equiv_\Gamma \Pi'$. The algorithm operates in
a series of steps so that $\Pi$ is first transformed to an equivalent open
program $\Pi_0$. Using algebraic transformations only, $\Pi_0$ is then
nondeterministically transformed to $\Pi_0'$. $\Pi'$ is obtained by
operating the topological transformations
{T1,T2,T3,T4,T5,T7,T8,T10,T11} on $\Pi_0'$.

PROOF : 

Let $\Pi_0$ be an open program equivalent to $\Pi$ in which each path is
executable. $\Pi_0$ can be obtained by first eliminating nonexecutable paths
by a procedure similar to that of Theorem 3, and then constructing an open
program by using T1, T2, T9, T11 (Lemma 8).

Since $\Pi_0$ is open and the algebraic identities in $\Gamma$ preserve
openness, the algebraic transformations do not affect the relative position
of branch nodes of the dags that are not replaced by the algebraic
transformations. Thus all the algebraic transformations can be applied
before applying other transformations that operate on assignment
statements. It is also clear that algebraic transformations can be applied
before applying transformations that operate on tests, because they change
only assignment statements. Thus we can transform $\Pi_0$, by applying
algebraic transformations only, into a program $\Pi_0'$ such that
$\Pi_0' \equiv \Pi'$.

Operator and operand preserving identities preserve openness,
so that the algebraic transformations can be applied sequentially, without
adding statements for making the obtained programs open. Because the
number of statements is preserved, $\Pi_0'$ can be found in a finite number
of steps, although a heuristic procedure might help to find $\Pi_0'$ quicker.
Now we reduce $\Pi_0'$ and apply T3 to rename variables, such that no two
statements in a path will define the same variable. We obtain a new program;
the rest of the proof is exactly as in Theorem 10.

There are two generally recognized ways to optimize code generated
by a compiler: global optimization, which is concerned with the whole
program, and local optimization, which depends only on information in a
single expression or statement. Theorems 10 and 12 provided schemes for
global optimization without algebraic identities, and with certain types of
identities, respectively.

Some of the local optimization techniques presented in the literature
(Bagwell (3), Lowry and Medlock (7)) could be viewed as algebraic
identities that hold among operators, operands and constants. Examples of
algebraic identities that induce local optimization of code are

X * 2 = X + X

X ** 2 = X * X

X ** ½ = SQRT(X)

A ** (-C) = 1 / (A ** C)

A ** 2. = A ** 2

X * 1 = X

X + 0 = X

In  these  cases  the  object  of  optimizations  under  algebraic 
identities  is  not  to  expose  common  subexpressions  as  before,  but  to 
locally  improve  the  machine  code  generated.   Improvements  are  usually 
done  by  replacing  some  operators  with  operators  that  are  known  to  be 
more  efficient  in  these  special  cases. 
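
A minimal sketch of how a few such identities might be applied as local rewrite rules is given below. It assumes expressions are represented as nested tuples `(operator, left, right)` with strings for variables and numbers for constants; this representation and the function name `rewrite` are illustrative assumptions, not part of the formalism above.

```python
def rewrite(expr):
    """Apply a few local identities bottom-up to an expression tree.

    Expressions are nested tuples (op, left, right); leaves are variable
    names (str) or numeric constants. Only four identities are covered.
    """
    if not isinstance(expr, tuple):
        return expr                      # leaf: variable or constant
    op, left, right = expr
    left, right = rewrite(left), rewrite(right)
    if op == "*" and right == 2:
        return ("+", left, left)         # X * 2  = X + X
    if op == "**" and right == 2:
        return ("*", left, left)         # X ** 2 = X * X
    if op == "*" and right == 1:
        return left                      # X * 1  = X
    if op == "+" and right == 0:
        return left                      # X + 0  = X
    return (op, left, right)


print(rewrite(("+", ("*", "X", 2), 0)))      # ('+', 'X', 'X')
print(rewrite(("**", ("*", "A", 1), 2)))     # ('*', 'A', 'A')
```

The first two rules duplicate their left operand, so in this sketch they are only a sensible improvement when that operand is a simple variable; the last two rules remove an operator and are therefore not operator preserving.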

In the case of global optimization, a cost criterion was called
reasonable provided that the cost decreased if a statement was deleted from
some executable sequence of the program. In the case of local optimization,
a cost criterion is called reasonable provided that the cost decreases
if a subsequence of statements in some executable sequence of the program
is replaced by another subsequence of statements which is not longer
and which runs faster.

We will extend the procedure in Theorem 12 to apply to certain
types of algebraic identities that induce local optimization of code.

Because these identities operate on constants in addition to
operands and operators, we will extend the definition of a program schema
to include constants. The extension does not pose any difficulties and
will be described informally.

$\Sigma$, $\Phi$, T will be as before and C will be a countable set of
constants.

Assignment statements will be of the type

k.   $A \leftarrow \theta B_1 \ldots B_r$

$A \in \Sigma$, $B_1, \ldots, B_r \in \Sigma \cup C$ and $\exists i$, $1 \le i \le r$, $B_i \in \Sigma$.

Test statements will be of the type

$t(C_1, \ldots, C_r)$   $k_1, k_2$

$C_1, \ldots, C_r \in \Sigma \cup C$ and $\exists i$, $1 \le i \le r$, $C_i \in \Sigma$.

A  program  schema  will  be  a  quadruple  (P,  I,  U,  C)  where  P,  I,  U 
are  as  before,  and  C  is  a  finite  set  of  constants. 

Program  equivalence  will  be  defined  as  before. 

The graphical representation of programs as dags will be as
before, and a leaf will be created for each $c \in C$ in all the dags of the
program.

The definitions of operand and operator preserving identities
will be extended to include constants. The restrictions on operators
and operands are as before but constants can appear anywhere on the two
sides of the identity. By the new definition the identities

X * 2 = X + X

X ** 2 = X * X

X ** ½ = SQRT(X)

A ** (-C) = 1 / (A ** C)

A ** 2. = A ** 2

are all operand and operator preserving.

A + X * 1 = A + X   and

A + X + 0 = A + X

are not, because they are not operator preserving.

Under the new definition the number of statements is preserved,
and also the openness of the programs is preserved because constants appear
as leaves in the dags of the program. Therefore the scheme for optimization
provided by Theorem 12 applies also to the extended case of operator
and operand preserving algebraic identities. The cost function is
reasonable also in the sense that the cost decreases if local improvements
are made. Thus Theorem 12 provides a scheme for global and local
optimization where certain types of algebraic identities are assumed.

7.   LOGICAL  TRANSFORMATIONS

We  considered  a  program  schema  that  contained  a  countable  set  T 
of  test  names.   The  elements  of  T  will  be  called  elementary  tests. 

DEFINITION

A test is either

(1)  an elementary test, or

(2)  an expression of the form $p \lor r$ where p, r are tests, or

(3)  an expression of the form $p \cdot r$ where p, r are tests, or

(4)  an expression of the form $\bar{p}$ where p is a test.

The tests with the binary operations $\lor$ and $\cdot$ and with the set B
consisting of the two constants T, F can be shown to be the algebra of
Boolean functions of the elementary tests.

The  definition  of  a  program  schema  is  extended  to  include  test 
statements  that  have  both  elementary  tests  and  tests. 

An  interpretation  is  defined  as  before. 

The definitions of executable and nonexecutable paths are as in
the case of program schemata that have elementary tests only.

Boolean manipulations on the tests may reduce the cost of the
program. In the following example we apply Boolean manipulations to
expose nonexecutable paths that can later be eliminated so that the cost
of the program is reduced.

EXAMPLE 15

[Figure: flowchart of the program $\Pi$, with compound and elementary tests on A and B, assignment statements, and STOP nodes.]

The  path  that  includes  the  tests  t  (A,B)  and  the  left  branch 
of  ~t , (A)  and  the  path  that  includes  t  (A,B),  t  (A,B)  and  the  right  branch 
of  ~t  (A)  are  nonexecutable.  By  eliminating  them  we  get 


[Figure: flowchart of the program $\Pi'$ obtained after the elimination, with tests on (A, B), assignment statements, and STOP nodes.]


The logical transformations exposed nonexecutable paths of the
program, and tests were also eliminated from some executable sequences of
the program. Thus logical transformations, in addition to topological and
algebraic transformations, might reduce the cost of programs that have
tests.

Optimizing compilers that use logical transformations have been
built (see (5)). In FORTRAN, logical transformations may simplify the
logical expressions in IF statements. See the reference above and also
Lowry and Medlock (7).

The  cost  of  the  program  can  also  be  reduced  by  simplifying  the 
Boolean  expressions: 

[Figure: a program whose test on A is a compound Boolean expression, and the program obtained after that expression is simplified.]


Two  identical  tests  checking  different  values  are  considered  to 
be  different  variables  in  our  Boolean  algebra: 


EXAMPLE 16

[Figure: a program containing two p(A) tests, with one path marked.]

The value of the variable A checked by the first p(A) test is $\varphi XY$ and
by the second p(A) test is $\theta\varphi XYX$. Therefore the marked path
cannot be eliminated.

We  will  define  the  notion  of  equivalence  of  loop-free  programs 
that  have  tests  in  addition  to  elementary  tests. 

DEFINITION 

Let $\Pi$ be a loop-free program that has tests. With each executable
path $\ell$ of the program we will associate a path condition $\psi^{\ell}$
which is the condition that path $\ell$ is executed, and is a Boolean
function of the tests.

Identical tests checking different values are considered as
different variables of the Boolean function.

DEFINITION 

Let $\Pi$ and $\Pi'$ be two programs that have tests, and let $\ell$ and $k$
be two executable paths of $\Pi$ and $\Pi'$ respectively. $\ell$ and $k$ are
said to be consistent iff, when $\psi^{\ell}$ and $\psi^{k}$ are Boolean
functions of the elementary tests $t_1, t_2, \ldots, t_m$, there is some
assignment of truth values $a_1, \ldots, a_m$ to the variables
$t_1, t_2, \ldots, t_m$ which makes both $\psi^{\ell}$ and $\psi^{k}$ true.

Notice that the definition of consistent paths for programs
that have elementary tests only is a special case of this definition, where
$\psi^{\ell}$ and $\psi^{k}$ are products of the elementary tests appearing
in the paths.
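
Consistency of two paths amounts to joint satisfiability of their path conditions, which for a small number of elementary tests can be checked by enumerating truth assignments. The sketch below assumes path conditions are supplied as Python functions of an assignment dictionary; the function name `consistent` and the sample conditions are hypothetical.

```python
from itertools import product


def consistent(psi_l, psi_k, test_names):
    """Return True if some truth assignment to the elementary tests makes
    both path conditions true (brute force over all 2**m assignments)."""
    for values in product([False, True], repeat=len(test_names)):
        assignment = dict(zip(test_names, values))
        if psi_l(assignment) and psi_k(assignment):
            return True
    return False


# Hypothetical path conditions over two elementary tests t1, t2.
psi_l1 = lambda a: a["t1"] or a["t2"]              # t1 v t2
psi_k1 = lambda a: a["t1"]                         # t1
psi_k3 = lambda a: not a["t1"] and not a["t2"]     # (not t1)(not t2)

print(consistent(psi_l1, psi_k1, ["t1", "t2"]))    # True
print(consistent(psi_l1, psi_k3, ["t1", "t2"]))    # False
```

The enumeration is exponential in the number of elementary tests, so it is meant only to make the definition concrete.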

DEFINITION 

Let $\Pi$ and $\Pi'$ be two loop-free programs that have tests. $\Pi$ and
$\Pi'$ will be called equivalent iff for all interpretations In (of input
values, elementary tests and functions) $Val_{In}(\Pi) = Val_{In}(\Pi')$.

THEOREM 13

Let $\Pi$ and $\Pi'$ be two programs that have tests. $\Pi \equiv \Pi'$ iff
for all consistent pairs of paths $\ell$, $k$,
$v^{\ell}(\Pi) = v^{k}(\Pi')$.

PROOF: 

The  proof  is  similar  to  that  of  Theorem  2. 

For 1) we take the same interpretation In of inputs and
functions with the following interpretation for elementary tests:

For all elementary tests $t^j$ in path $j = \ell, k$

i)   determine $In(t^j(a_1, \ldots, a_n))$ for $(a_1, \ldots, a_n)$ such that
$a_i = v^j(X_i)$, $1 \le i \le n$, and $t^j(X_1, \ldots, X_n)$ appears in the
test statement $S_j$, in such a way that both $\psi^{\ell}$ and $\psi^{k}$
are true.

ii)  $In(t^j(a_1, \ldots, a_n))$ is arbitrary for other n-tuples
$(a_1, \ldots, a_n)$.

$\ell$ and $k$ are consistent, therefore there is an assignment of the
elementary tests which makes both $\psi^{\ell}$ and $\psi^{k}$ true. Under
this interpretation paths $\ell$ and $k$ are executed in $\Pi$ and $\Pi'$
respectively.

The  rest  of  1)  and  also  2)  are  the  same  as  in  Theorem  2. 

EXAMPLE  17 

Let $\Pi$ and $\Pi'$ be the programs of Example 15. The pairs of paths
$(\ell_1, k_2)$, $(\ell_2, k_1)$, $(\ell_3, k_3)$ are consistent and their
values are identical.

$\psi^{\ell_1} = (t_1 \lor t_2)\,\bar{t}_1 = t_2 \bar{t}_1$,   $\psi^{k_2} = \bar{t}_1 t_2$

$\psi^{\ell_1} = \psi^{k_2}$, thus there is an assignment of truth values to
$t_1$ and $t_2$ that makes both $\psi^{\ell_1}$ and $\psi^{k_2}$ true.
Therefore $\ell_1$ and $k_2$ are consistent.

$v^{\ell_1}(\Pi) = v^{k_2}(\Pi')$

$\psi^{\ell_2} = (t_1 \lor t_2)\,t_1 = t_1 \lor t_1 t_2 = t_1$,   $\psi^{k_1} = t_1$

thus $\psi^{\ell_2} = \psi^{k_1}$.

$v^{\ell_2}(\Pi) = v^{k_1}(\Pi')$

$\psi^{\ell_3} = \bar{t}_1 \bar{t}_2$,   $\psi^{k_3} = \bar{t}_1 \bar{t}_2$

thus $\psi^{\ell_3} = \psi^{k_3}$.

$v^{\ell_3}(\Pi) = v^{k_3}(\Pi')$

In $\Pi$ and $\Pi'$ consistent paths have the same value, therefore by
Theorem 13 $\Pi \equiv \Pi'$.


EXAMPLE  18 


The following logical transformations preserve equivalence.

[Figure: the logical transformations i), ii) and iii).]

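In i) $\psi^{\ell_1} = t_1 \lor t_2$, $\psi^{k_1} = t_1$. There is an
assignment of truth values to $t_1$ and $t_2$ which makes both
$\psi^{\ell_1}$ and $\psi^{k_1}$ true (take $t_1$ = T, $t_2$ = F).
Thus $\ell_1$ and $k_1$ are consistent. The values of $\ell_1$ and $k_1$ are
identical.

$\psi^{k_2} = \bar{t}_1 t_2$. The assignment $t_1$ = F, $t_2$ = T makes both
$\psi^{\ell_1}$ and $\psi^{k_2}$ true, therefore $k_2$ and $\ell_1$ are
consistent. The values of $k_2$ and $\ell_1$ are identical.

$\psi^{\ell_2} = \overline{t_1 \lor t_2} = \bar{t}_1 \bar{t}_2 = \psi^{k_3}$.
Thus $\ell_2$ and $k_3$ are consistent. Their values are identical.

Therefore $L_1$ preserves equivalence.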

ii) $\psi^{\ell_1} = t_1 t_2 = \psi^{k_1}$. The values of $\ell_1$ and $k_1$
are identical.

$\psi^{\ell_2} = \overline{t_1 t_2}$,   $\psi^{k_2} = t_1 \bar{t}_2$

The assignment $t_1$ = T, $t_2$ = F makes both $\psi^{\ell_2}$ and
$\psi^{k_2}$ true, therefore $\ell_2$ and $k_2$ are consistent. The values
of $\ell_2$ and $k_2$ are identical.

$\psi^{k_3} = \bar{t}_1$,   $\psi^{\ell_2} = \bar{t}_1 \lor \bar{t}_2$

The assignment $t_1$ = F, $t_2$ = F makes both $\psi^{\ell_2}$ and
$\psi^{k_3}$ true, therefore $\ell_2$ and $k_3$ are consistent. Their values
are identical.

Therefore $L_2$ preserves equivalence.

iii) The pairs $(\ell_1, k_2)$, $(\ell_2, k_1)$ are consistent and their
values are the same, therefore $L_3$ preserves equivalence.

Any Boolean manipulation on the tests that preserves equivalence
will be called a logical transformation on the program.

We will prove a theorem equivalent to Theorem 8 for the case
that logical transformations are included:

THEOREM 14

Let $\Pi$ and $\Pi'$ be two loop-free programs that have tests. Then

$\Pi \equiv \Pi'$   iff   $\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\} \cup L} \Pi'$

where L is a set of logical transformations.

PROOF : 

1) "if" is trivial because logical transformations and
topological transformations preserve equivalence of programs.

2) For every loop-free program with tests there is an equivalent
loop-free program with elementary tests only, which is obtained by
applying logical transformations. We will show that by induction on the
structure of a test.

If a test is of the form $p \lor r$ then

[Figure: the test $p \lor r$ with subtrees $S^l$ and $S^r$ is replaced by the test p followed, on one branch, by the test r.]

where $S^l$ and $S^r$ are the left and right subtrees, respectively. (We
assume that the graph of the given program is a tree. This assumption
does not cause any difficulties since by repeated applications of T11
in reverse any program can be transformed to an equivalent program whose
graph is a tree.)

If a test is of the form $p \cdot r$ then

[Figure: the test $p \cdot r$ is replaced by the test p followed, on one branch, by the test r.]

If a test is of the form $\bar{p}$ then

[Figure: the test $\bar{p}$ is replaced by the test p with its two subtrees interchanged.]

These transformations preserve equivalence (see Example 18).
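
The induction on the structure of a test can be sketched as a recursive expansion of a compound test into a tree of elementary tests. The tuple encoding of tests, the node format `("test", name, true_subtree, false_subtree)` and the function names below are assumptions made for illustration; subtrees are duplicated, in line with the assumption that the program graph is a tree.

```python
from itertools import product


def expand(test, s_true, s_false):
    """Expand a compound test into nested elementary tests.

    test is ("var", name), ("or", p, r), ("and", p, r) or ("not", p);
    s_true / s_false are the subtrees taken when the test holds / fails.
    """
    kind = test[0]
    if kind == "var":
        return ("test", test[1], s_true, s_false)
    if kind == "or":      # p true -> s_true, otherwise decide on r
        return expand(test[1], s_true, expand(test[2], s_true, s_false))
    if kind == "and":     # p false -> s_false, otherwise decide on r
        return expand(test[1], expand(test[2], s_true, s_false), s_false)
    if kind == "not":     # interchange the two subtrees
        return expand(test[1], s_false, s_true)
    raise ValueError(f"unknown test form: {kind!r}")


def evaluate(test, env):
    """Evaluate a compound test directly under a truth assignment."""
    kind = test[0]
    if kind == "var":
        return env[test[1]]
    if kind == "not":
        return not evaluate(test[1], env)
    left, right = evaluate(test[1], env), evaluate(test[2], env)
    return (left or right) if kind == "or" else (left and right)


def run(tree, env):
    """Follow elementary tests until a leaf (a subtree label) is reached."""
    while isinstance(tree, tuple) and tree[0] == "test":
        _, name, yes, no = tree
        tree = yes if env[name] else no
    return tree


compound = ("or", ("var", "t1"), ("and", ("var", "t2"), ("not", ("var", "t3"))))
tree = expand(compound, "S_l", "S_r")
print(tree)
```

Under every truth assignment, `run(tree, env)` reaches `"S_l"` exactly when `evaluate(compound, env)` is true, which is the sense in which the expansion preserves equivalence in this sketch.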


EXAMPLE:

[Figure: a program $\Pi$ with compound tests and the program $\Pi'$ obtained from it by the transformations above.]

$\Pi \equiv \Pi'$ and $\Pi'$ has elementary tests only.

Let $\Pi_1$ and $\Pi_1'$ be loop-free programs that have elementary tests
only, equivalent to $\Pi$ and $\Pi'$ respectively.

T11 is needed to transform the graph of $\Pi$ to a tree.

$\Pi \Rightarrow^{*}_{\{11\} \cup L_1} \Pi_1$,   $\Pi' \Rightarrow^{*}_{\{11\} \cup L_1'} \Pi_1'$.

$L_1$ and $L_1'$ are all the logical transformations required to transform
$\Pi$ to $\Pi_1$, and $\Pi'$ to $\Pi_1'$ respectively. It follows that
$\Pi_1 \equiv \Pi_1'$. $\Pi_1$ and $\Pi_1'$ are loop-free programs with
elementary tests only. Therefore by Theorem 8

$\Pi_1 \Rightarrow^{*}_{\{1,2,6,7,9,10,11\}} \Pi_1'$

$\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\} \cup L_1} \Pi_1'$

Thus   $\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\} \cup L_1 \cup L_1'} \Pi'$

where $L_1'$ are all the logical transformations required to transform
$\Pi'$ to $\Pi_1'$. Let $L = L_1 \cup L_1'$. Thus

$\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\} \cup L} \Pi'$   where L is a set of
logical transformations.

8.   OPTIMIZATION  UNDER  LOGICAL  TRANSFORMATIONS

The  object  of  optimization  under  logical  transformations  is 
i)  to  eliminate  nonexecutable  paths, 
ii)  to  minimize  the  number  of  tests  performed. 

Example  15  above  showed  how  logical  transformations  may  reduce 
the  cost  of  the  program  by  eliminating  nonexecutable  paths.   Minimizing 
the  number  of  tests  performed  is  done  by  (a)  simplifying  the  Boolean 
expressions  of  the  tests  and  (b)  by  breaking  down  the  logical  expressions 
so  that  only  the  necessary  minimum  number  of  tests  are  performed. 


EXAMPLE  19 

The program $\Pi_1$ has the following test:

[Figure: the test $t_1 \lor t_2 t_3$ with subtrees $S^l$ and $S^r$.]

where $S^l$ and $S^r$ are the left and right subtrees, respectively.

The logical expression is completely determined if $t_1$ is true,
and there is no need in this case to perform the tests $t_2$ and $t_3$.
Also, if $t_1$ is false and one of $t_2$, $t_3$ is false, there is no need
to perform the other test. Thus by transforming $\Pi_1$ to $\Pi_2$

[Figure: the program $\Pi_2$, in which $t_1$, $t_2$ and $t_3$ are tested separately.]

we reduce the number of tests performed in the following cases:
1)  $t_1$ is true
2)  $t_1$ is false and $t_2$ is false.
In all the other cases, the cost of the program is unchanged.


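If $\Pi_1$ has the test

[Figure: a test given by a negated compound expression.]

we first have to simplify the Boolean expression by de Morgan's law

[Figure: the test after applying de Morgan's law.]

and now we apply the arguments above and break down the tests. We get

[Figure: the resulting tree of elementary tests.]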

Compilers that break down logical expressions in order to
accelerate the execution of logical statements have been constructed.
See Lowry and Medlock (7) and Huskey and Wattenberg (4).

The  definition  of  a  reasonable  cost  for  programs  with 
elementary  tests  is  changed  to  include  (ii)  above. 
DEFINITION 

A  cost  function  on  programs  that  have  tests  is  reasonable 
provided  that 




1)  the  cost  decreases  if  statements  which  are  never  executed  are 
deleted  from  the  program  in  such  a  way  that  statements  are  not  added 
to  the  program. 

2)  the  cost  decreases  if  a  statement  is  deleted  from  some  executable 
sequence  of  the  program. 

3)  the  cost  decreases  if  identical  subgraphs  and  identical  statements 
are  merged  in  such  a  way  that  tests  are  not  added  to  the  program. 

4)  the cost decreases if Boolean expressions of tests are broken down
so that the number of elementary tests performed is reduced.

Boolean algebra theorems which are useful in simplifying
Boolean expressions can be used to eliminate nonexecutable paths.
Examples:

1.   $X \lor XY = X$ is equivalent to

[Figure: the corresponding transformation on trees.]

2.   $XY \lor \bar{X}Y = Y$ is equivalent to the transformation

[Figure: the corresponding transformation on trees, followed by T7, T11.]

In the following stages of the optimization the topological
transformations T11 and T7 will transform the tree on the right to
a tree equivalent to the expression Y.

3.   $(X \lor \bar{Y})Y = XY$ is equivalent to the transformation

[Figure: the corresponding transformation on trees.]

4.   $XY \lor \bar{X}Z \lor YZ = XY \lor \bar{X}Z$ is equivalent to

[Figure: the corresponding transformation on trees.]

The transformations in 1) - 4) do not add statements to the
program.
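
The Boolean theorems used in these examples can be confirmed by a truth-table check. The sketch below assumes the four identities are the standard absorption, combining, and consensus laws, written with a prime marking negation; the helper name `equivalent` is hypothetical.

```python
from itertools import product


def equivalent(lhs, rhs, n_vars):
    """Return True if two Boolean functions agree on every assignment."""
    return all(
        bool(lhs(*values)) == bool(rhs(*values))
        for values in product([False, True], repeat=n_vars)
    )


# (label, left side, right side, number of variables); X' denotes not X.
identities = [
    ("X v XY = X",
     lambda x, y: x or (x and y), lambda x, y: x, 2),
    ("XY v X'Y = Y",
     lambda x, y: (x and y) or ((not x) and y), lambda x, y: y, 2),
    ("(X v Y')Y = XY",
     lambda x, y: (x or not y) and y, lambda x, y: x and y, 2),
    ("XY v X'Z v YZ = XY v X'Z",
     lambda x, y, z: (x and y) or ((not x) and z) or (y and z),
     lambda x, y, z: (x and y) or ((not x) and z), 3),
]

results = {label: equivalent(lhs, rhs, n) for label, lhs, rhs, n in identities}
for label, ok in results.items():
    print(label, ok)
```

Each identity removes at least one occurrence of an elementary test, which is why applying it to a program corresponds to deleting paths rather than adding statements.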

The  following  theorem  will  give  a  procedure  for  optimizing 
programs  that  have  tests  under  reasonable  cost  functions. 
THEOREM 15

There is an algorithm that finds an optimal program $\Pi'$ for
a given program $\Pi$ that has tests. The algorithm operates in a series
of steps so that $\Pi$ is first transformed by logical transformations
to an equivalent program $\Pi_2$ that has elementary tests and negations
only. Using logical transformations that consist of negations only,
$\Pi_2$ is then transformed to a program $\Pi_2'$. $\Pi'$ is obtained by
operating the topological transformations on $\Pi_2'$.
PROOF: 

We first assume that all nonexecutable paths can be
eliminated from $\Pi$ in such a way that statements are not added to the
program. Thus in every optimal program each path is executable. Take
$\Pi_1$ to be a program logically equivalent to $\Pi$ in which each path is
executable.

$\Pi \Rightarrow^{*}_{L_1} \Pi_1$

where $L_1$ are the logical transformations necessary to transform $\Pi$
into $\Pi_1$, which are similar to T6 but operate also on negations of
tests.

By the definition of reasonable cost functions every optimal
program has elementary tests and negations of elementary tests only.
That is because the number of tests performed is minimized if logical
expressions are broken down into tests that do not have the $\lor$ and
$\cdot$ operations. Take $\Pi_2$ to be a program logically equivalent to
$\Pi_1$ which has the minimum number of tests performed. $\Pi_2$ is obtained
by breaking the logical expressions in $\Pi_1$ into elementary tests and
negations of elementary tests. In some cases de Morgan's laws must be used
so that breaking the logical functions would be possible.
$\Pi_1 \Rightarrow^{*}_{L_2} \Pi_2$ where $L_2$ is the set of logical
transformations necessary to transform $\Pi_1$ into $\Pi_2$.

There may be several programs equivalent to $\Pi_1$ that have
elementary tests and negations of elementary tests only. For example

[Figure: the test $t_1 \lor t_2$ is equivalent to a tree that tests $t_1$ first, but also to a tree that tests $t_2$ first.]

Thus the process of constructing $\Pi_2$ is nondeterministic. Only a finite
number of $\Pi_2$ can be reached, because breaking the expressions can be
done in a finite number of forms only.

In the above example the two possible trees have the same
cost provided there is no additional information on the tests. But in
some cases, if we permute the terms we might reduce the cost of the
program. For example $t_1 (t_2 \lor t_3)$ and $(t_2 \lor t_3) t_1$ are the
same function according to the definition of Boolean algebra but the
corresponding trees

[Figure: the tree for $t_1 (t_2 \lor t_3)$ on the left and the tree for $(t_2 \lor t_3) t_1$ on the right.]

are different and their costs are not the same. (The average number of
tests performed by the program on the right is greater.)

In order to avoid checking all possible programs with
elementary tests equivalent to the given program, we would like to
have an algorithm that, given a Boolean expression, will find the minimal
tree corresponding to it. This problem is a special case of the problem
presented in Slagle (10). That paper is concerned with finding a
minimum-cost tree equivalent to the given Boolean expression where the
cost of applying each elementary test and the probability of its
outcome are given. When no information is given on the tests we may
assume that the costs of the tests are equal and the probability of
each test to be true is equal to its probability to be false. Thus we
get a special case of the general problem.
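
To make the cost difference between two orderings concrete, the sketch below counts elementary tests under left-to-right, short-circuit evaluation and averages over all truth assignments, i.e. unit costs and probability 1/2 for each test, with each test occurring once. The tuple encoding and function names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product


def count_tests(expr, env):
    """Return (value, number of elementary tests performed) when expr is
    evaluated left to right, stopping as soon as the value is determined."""
    kind = expr[0]
    if kind == "var":
        return env[expr[1]], 1
    if kind == "not":
        value, n = count_tests(expr[1], env)
        return (not value), n
    left_value, left_n = count_tests(expr[1], env)
    if kind == "or" and left_value:
        return True, left_n               # right operand not needed
    if kind == "and" and not left_value:
        return False, left_n              # right operand not needed
    right_value, right_n = count_tests(expr[2], env)
    return right_value, left_n + right_n


def expected_tests(expr, names):
    """Average number of tests over all 2**n equally likely assignments."""
    total = 0
    for values in product([False, True], repeat=len(names)):
        total += count_tests(expr, dict(zip(names, values)))[1]
    return Fraction(total, 2 ** len(names))


t1, t2, t3 = ("var", "t1"), ("var", "t2"), ("var", "t3")
names = ["t1", "t2", "t3"]
print(expected_tests(("and", t1, ("or", t2, t3)), names))   # 7/4
print(expected_tests(("and", ("or", t2, t3), t1), names))   # 9/4
```

With these assumptions the first ordering performs 7/4 tests on average and the second 9/4, in line with the remark that the two trees do not have the same cost.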

The algorithm finds a low cost tree for the given expression
in the following way:

Suppose an expression S is of the form $S = S_1 \lor \ldots \lor S_n$. Each
permutation of the $S_j$ gives a different tree. To obtain a low cost
expression, expressions $S_j$ that have low cost and high probability of
being true should be carried out first. We will distinguish between
the following cases:

a)  for a variable $x_i$ the expression that gives a low cost tree is $x_i$
itself.

b)  if an expression is of the form $S_1 \lor \ldots \lor S_n$ and none of
the $S_j$ is a disjunction then find for each $S_j$ a low cost expression
$R_j$, and the low cost expression for S will be a permutation of the $R_j$,
$R_1 \lor \ldots \lor R_n$, such that

$\frac{c_1}{p_1} \le \ldots \le \frac{c_n}{p_n}$

where $c_j$ is the cost of $R_j$ and $p_j$ is its probability to be true.

c)  if the expression is of the form $S_1 S_2 \ldots S_n$ and none of the
$S_j$ is a conjunction then find for each $S_j$ a low cost expression $R_j$
and then permute the $R_j$ and get the expression $R_1 \ldots R_n$ such that

$\frac{c_1}{q_1} \le \ldots \le \frac{c_n}{q_n}$

where $c_j$ is the cost of $R_j$ and $q_j = 1 - p_j$, $p_j$ is its
probability to be true.

Suppose we wish to find a low cost tree for $x_1 \lor x_2 (x_3 \lor x_4)$
and $c_i$, $p_i$ are as follows: $c_i = (176, 80, 40, 42)$,
$p_i = (\frac{1}{2}, \frac{1}{2}, \frac{1}{4}, \frac{1}{2})$. We first
find low cost expressions for $x_1$ and $x_2 (x_3 \lor x_4)$. For $x_1$ we
get $x_1$ itself. For $x_2 (x_3 \lor x_4)$ we find low cost expressions for
$x_2$ and $x_3 \lor x_4$.

For $x_3 \lor x_4$ we have $\frac{42}{1/2} \le \frac{40}{1/4}$, therefore
the low cost expression is $x_4 \lor x_3$. Its cost is

$c_4 p_4 + (c_4 + c_3) q_4 p_3 + (c_4 + c_3) q_4 q_3 = c_4 + q_4 c_3 = 62$.

Its probability is $p_4 + q_4 p_3 = \frac{5}{8}$.

The low cost expression for $x_2 (x_3 \lor x_4)$ is $x_2 (x_4 \lor x_3)$
because

$\frac{c(x_2)}{q(x_2)} = \frac{80}{1/2} \le \frac{c(x_4 \lor x_3)}{q(x_4 \lor x_3)} = \frac{62}{3/8}$.

The cost of $x_2 (x_4 \lor x_3)$ is 111 and its probability is
$\frac{5}{16}$.

The low cost expression for $x_1 \lor x_2 (x_3 \lor x_4)$ is
$x_1 \lor x_2 (x_4 \lor x_3)$ because
$\frac{176}{1/2} \le \frac{111}{5/16}$.

If no information is given on the tests, i.e. we assume
$c_i = 1$, $p_i = \frac{1}{2}$ for all $i$, then the low cost expression for
$x_2 (x_3 \lor x_4)$ can be either $x_2 (x_3 \lor x_4)$ or
$x_2 (x_4 \lor x_3)$ but not $(x_3 \lor x_4) x_2$ because

$\frac{1}{1/2} = \frac{c(x_2)}{q(x_2)} < \frac{c(x_3 \lor x_4)}{q(x_3 \lor x_4)} = \frac{3/2}{1/4}$.
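
The ordering rule can be sketched as a small recursive procedure. The code below is an illustration under stated assumptions, not Slagle's published algorithm verbatim: tests are taken to be independent, probabilities lie strictly between 0 and 1, and the tuple encoding and the names `var` and `low_cost` are hypothetical. It is run on costs (176, 80, 40, 42) and probabilities (1/2, 1/2, 1/4, 1/2) for $x_1, \ldots, x_4$.

```python
from fractions import Fraction


def var(name, cost, p_true):
    """An elementary test with its cost and its probability of being true."""
    return ("var", name, Fraction(cost), Fraction(p_true))


def low_cost(expr):
    """Return (text, expected cost, probability true) of a low cost ordering.

    expr is a var(...) tuple or ("or", e1, ..., en) / ("and", e1, ..., en).
    Disjuncts are ordered by increasing c/p, conjuncts by increasing c/q.
    """
    if expr[0] == "var":
        _, name, cost, p_true = expr
        return name, cost, p_true
    kind = expr[0]
    if kind not in ("or", "and"):
        raise ValueError(f"unknown expression form: {kind!r}")
    operands = [low_cost(e) for e in expr[1:]]
    if kind == "or":
        operands.sort(key=lambda r: r[1] / r[2])          # c / p
    else:
        operands.sort(key=lambda r: r[1] / (1 - r[2]))    # c / q
    cost, reach = Fraction(0), Fraction(1)   # reach: P(operand is evaluated)
    for _, c, p in operands:
        cost += reach * c
        reach *= (1 - p) if kind == "or" else p
    prob = 1 - reach if kind == "or" else reach
    sep = " v " if kind == "or" else " . "
    return "(" + sep.join(r[0] for r in operands) + ")", cost, prob


x1 = var("x1", 176, Fraction(1, 2))
x2 = var("x2", 80, Fraction(1, 2))
x3 = var("x3", 40, Fraction(1, 4))
x4 = var("x4", 42, Fraction(1, 2))

print(low_cost(("or", x3, x4)))                       # cost 62, probability 5/8
print(low_cost(("and", x2, ("or", x3, x4))))          # cost 111, probability 5/16
print(low_cost(("or", x1, ("and", x2, ("or", x3, x4))))[0])
```

Exact rational arithmetic via `Fraction` is used so that the comparisons of ratios such as 352 against 355.2 are not affected by rounding.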

For the case of disjunctions of conjunctions of singly
occurring variables and for the case of conjunctions of disjunctions
of singly occurring variables the algorithm above is proved to find a
minimum cost tree. Although this fact cannot be extended to the
general case, an exhaustive search in the general case may be made
more efficient by using the above algorithm.

Several improvements can be made to Slagle's algorithm
concerning extending his results to variables appearing more than once:

i)   as we showed above, some cases of variables occurring more
than once can be eliminated by getting rid of nonexecutable paths.
This holds for

$X_1 \ldots X_n \lor X_1 \ldots X_n Y_1 \ldots Y_m = X_1 \ldots X_n$

where we eliminated the double occurrence of $X_1 \ldots X_n$. This is done
by eliminating nonexecutable paths of the program. This is also true for

$X_1 \ldots X_n (X_1 \ldots X_n \lor Y_1 \ldots Y_m) = X_1 \ldots X_n$

$X Y_1 \ldots Y_n \lor \bar{X} Y_1 \ldots Y_n = Y_1 \ldots Y_n$

ii)   the algorithm applies also to those cases of
disjunctions of conjunctions of variables appearing more than once
that can be expressed as conjunctions of disjunctions.

We might conclude that the algorithm above might be used to get Π_0
quicker, either in its general form, when the cost of applying each
elementary test and the probability of its outcome are given, or in
its special case, when no information is given.
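
The Boolean identities listed under (i) can be confirmed by a
truth-table check. The sketch below is an illustration only; it fixes
n = m = 2, and the helper name `equivalent` is hypothetical.

```python
from itertools import product

def equivalent(f, g, nvars):
    """True if f and g agree on every assignment of nvars Boolean variables."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=nvars))

# X1 X2 v X1 X2 Y1 Y2 = X1 X2
absorb_or = equivalent(
    lambda x1, x2, y1, y2: (x1 and x2) or (x1 and x2 and y1 and y2),
    lambda x1, x2, y1, y2: x1 and x2,
    4,
)

# X1 X2 (X1 X2 v Y1 Y2) = X1 X2
absorb_and = equivalent(
    lambda x1, x2, y1, y2: (x1 and x2) and ((x1 and x2) or (y1 and y2)),
    lambda x1, x2, y1, y2: x1 and x2,
    4,
)

# X Y1 Y2 v (not X) Y1 Y2 = Y1 Y2
merge = equivalent(
    lambda x, y1, y2: (x and y1 and y2) or ((not x) and y1 and y2),
    lambda x, y1, y2: y1 and y2,
    3,
)
```

All three checks hold, so in each case the repeated occurrence of a
variable can be removed before the ordering algorithm is applied.
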


We define L = L_1 ∪ L_0.


Π and Π' both have elementary tests and negations only, and are
equivalent by a set of transformations composed of topological
transformations and logical transformations that consist of negations
only. Logical transformations that include negations might expose
identical subgraphs. Example:

[Figure: flowchart example involving the test t(A); not recoverable
from the source.]

Also, logical transformations that include negations only might make
flipping of tests possible. Therefore we will transform Π to a
program Π_0 equivalent to Π by logical transformations only, such that
Π_0 is equivalent to Π' by topological transformations only. If Π' is
not known, all choices for Π_0 must be considered; thus Π_0 is
constructed nondeterministically. There is only a finite number of
possible Π_0, but some algorithm or heuristic based on the nature of
the cost function might get Π_0 quicker. (In some cases all Π_0 have
the same cost, but the costs might be different for some cost
functions. For example, the cost of the test for zero might be less
than the cost of the test for a non-zero variable.) A good algorithm
will construct Π_0 such that tests checking the same values will have
the same logical value, so that applications of T8 and T11 (if needed)
might be possible.

    Π =(1,2,3,4,5,7,8,9,10,11)=> Π'

The algorithm to obtain an optimal program Π' from Π_0 by using
topological transformations only is the same as in Theorem 10.

Since we showed that breaking down the tests into elementary tests
reduces the cost of the program, a different algorithm from that of
Theorem 15 could be designed which first breaks down the tests so as
to get some equivalent program with elementary tests only, and then
applies the topological transformations in order to get an optimal
program. The algorithm of Theorem 10 cannot be used here because it
does not find a program with the minimum number of elementary tests.
Obtaining an optimal program with the minimum number of tests by using
topological transformations only is more complicated than Slagle's
algorithm, since it involves a process of adding new tests and new
nonexecutable paths to the program and eliminating other tests that
become useless and paths that become nonexecutable.

We showed how Boolean manipulations of the tests eliminated
nonexecutable paths and therefore reduced the cost of the program.
Optimizing the program by eliminating nonexecutable paths can also be
done when we have additional information on the tests. As an example,
we will show how the use of the test for equality may eliminate
nonexecutable paths. If we have the program

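    A ← φCD
    B ← φCD

    [Figure: the two assignments are followed by a test statement;
    flowchart not recoverable from the source.]

then v(A) = v(B) and the right path can never be executed.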
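
The reasoning can be made concrete by computing symbolic values. The
following Python sketch is an illustration under our own assumptions:
a value is represented as a nested tuple (operator, operand values),
variables that have not been assigned stand for their initial values,
and the names `symbolic_values` and `equality_forced` are hypothetical.

```python
def symbolic_values(assignments):
    """Run straight-line assignments (target, operator, operands)
    symbolically and return a map from variable to symbolic value."""
    env = {}
    for target, operator, operands in assignments:
        # An unassigned operand stands for its own initial value.
        env[target] = (operator,) + tuple(env.get(b, b) for b in operands)
    return env

def equality_forced(env, a, b):
    """True if a and b hold the same symbolic value, so that a test for
    their equality succeeds under every interpretation."""
    return env.get(a, a) == env.get(b, b)

env = symbolic_values([("A", "phi", ("C", "D")),
                       ("B", "phi", ("C", "D"))])
forced = equality_forced(env, "A", "B")   # both are phi(C, D)
```

A result of False only means that the symbolic values differ; the two
variables may still be equal under some interpretations, so the sketch
identifies nonexecutable paths only in the True case.
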

This case is similar to the case in which algebraic identities are
known to hold among operators. Here the additional information is on
the tests. The model of program schemata is as before, except that
test statements include tests instead of test names. The following
cases are possible:

(i) Test: =(X_1,...,X_n, Y_1,...,Y_n), where v(X_i) = v(Y_i),
1 ≤ i ≤ n. The marked path is nonexecutable and can be eliminated.

(ii) Test: g(X_1,...,X_n) = g(Y_1,...,Y_n), where v(X_i) = v(Y_i),
1 ≤ i ≤ n, and g is any function. The marked path is nonexecutable.

(iii) Test: P(X_1,...,X_n) ⟺ P(Y_1,...,Y_n), where v(X_i) = v(Y_i),
1 ≤ i ≤ n, and P is a proposition.

(iv) Tests: =(X_1,...,X_n, Y_1,...,Y_n) and
g(X_1,...,X_n) = g(Y_1,...,Y_n), where g is any function. Here the
values of X_i and Y_i might be different for some i (i.e. the
expressions are different) but the vectors (X_i) and (Y_i) might be
equal under some interpretation. The marked path is nonexecutable.

(v) Tests: =(X_1,...,X_n, Y_1,...,Y_n) and
P(X_1,...,X_n) ≡ P(Y_1,...,Y_n), where P is a proposition.

[Figures: the flowcharts for cases (i)-(v), including the marked
paths, are not recoverable from the source.]

All these transformations preserve equivalence of programs.
A scheme for optimization similar to that of Theorem 15 can be
constructed for the cases above.

PROGRAM  SCHEMATA  WHICH  ALWAYS  HALT


The  equivalence  problem  for  the  class  of  program  schemata 
which  always  halt  is  decidable,  although  membership  in  this  class  is 
not.   (See  Luckham,  Park  and  Paterson  (8)). 

The  decision  procedure  of  (8)  is  based  on  the  fact  that  for 
each  program  that  halts  under  all  interpretations  we  can  construct 
effectively  an  equivalent  loop-free  schema.   We  unwind  the  loops, 
discarding  any  paths  which  are  nonexecutable.   If  this  process  continues 
indefinitely,  the  hypotheses  of  Koenig's  lemma  are  satisfied,  implying 
that  there  is  an  infinite  executable  path  and  thus  the  program  diverges 
under  some  interpretation.   Therefore  for  programs  which  always  halt 
this  procedure  must  produce  an  equivalent,  finite,  loop-free  program. 
Thus  the  decision  procedure  of  Theorem  2  applies  also  to  programs  which 
always  halt. 
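
The unwinding procedure can be sketched as follows. This Python
fragment is an illustration under our own assumptions and not the
procedure of (8) itself: the encoding of statements as tuples, the
name `unwind`, and the depth bound are hypothetical. Values are kept
symbolically, and a test whose outcome on the same symbolic value is
already fixed on the current path contributes only its executable
branch. Since membership in the class of always-halting schemata is
undecidable, the depth bound is merely a practical guard.

```python
# Statement forms of a toy schema (hypothetical encoding):
#   ("assign", var, func, args, next_label)
#   ("test", pred, args, label_if_true, label_if_false)
#   ("stop",)

def unwind(program, label, env=None, known=None, depth=0, max_depth=50):
    """Return a loop-free tree equivalent to `program` started at `label`."""
    if depth > max_depth:
        raise RuntimeError("depth bound exceeded; the schema may not always halt")
    env = env or {}        # variable -> symbolic value on this path
    known = known or {}    # (pred, values...) -> outcome fixed on this path
    stmt = program[label]
    if stmt[0] == "stop":
        return ("stop",)
    if stmt[0] == "assign":
        _, var, func, args, nxt = stmt
        value = (func,) + tuple(env.get(a, a) for a in args)
        rest = unwind(program, nxt, {**env, var: value}, known,
                      depth + 1, max_depth)
        return ("assign", var, func, args, rest)
    _, pred, args, if_true, if_false = stmt
    key = (pred,) + tuple(env.get(a, a) for a in args)
    if key in known:
        # Same test on the same value: the other branch is nonexecutable.
        branch = if_true if known[key] else if_false
        return unwind(program, branch, env, known, depth + 1, max_depth)
    return ("test", pred, args,
            unwind(program, if_true, env, {**known, key: True},
                   depth + 1, max_depth),
            unwind(program, if_false, env, {**known, key: False},
                   depth + 1, max_depth))

# A schema with a syntactic loop (s1 -> s3 -> s1) that is never repeated,
# because x is not changed between the two tests p(x).
program = {
    "s0": ("test", "p", ("x",), "s1", "end"),
    "s1": ("assign", "y", "f", ("y",), "s3"),
    "s3": ("test", "p", ("x",), "end", "s1"),
    "end": ("stop",),
}
tree = unwind(program, "s0")
```

In the resulting tree the second test p(x) has disappeared: on the
only path that reaches it, its outcome is already known to be true, so
the branch leading back into the loop is nonexecutable.
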
EXAMPLE  20

Produce an equivalent loop-free program for a program which always
halts.

Π:

[Figure: flowchart of Π; the surviving labels are the assignment
L_2 ← FL_2 and STOP. Not recoverable from the source.]

The following is an equivalent loop-free program:

[Figure: the unwound flowchart; the surviving labels are the
assignments L_1 ← FL_1 and L_2 ← FL_2, the test t(L_1), several STOP
statements, and value annotations such as v(L_1) = FL_1,
v(L_1) = FFL_1 and v(L_1) = FFFL_1. Not recoverable from the source.]

The last test statement t(L_1) tests the same value as does the
statement marked by (*). Therefore the left path is nonexecutable and
the procedure stops. We get an equivalent loop-free program.

All the transformations that operate on loop-free programs can be
used to simplify programs which always halt. We will denote by T12
the transformation that maps programs which always halt to loop-free
programs. Because T12 eliminates nonexecutable paths, every program
which always halts is transformed by T12 to a loop-free program in
which each path is executable.

THEOREM  16

Let Π and Π' be two programs which always halt. Then Π ≡ Π' if and
only if

    Π <=(1,2,6,7,9,10,11,12)=> Π'

The proof is obvious by using the results of the previous chapters.

By  using  T12  first,  and  then  applying  the  procedure  of 
Theorem  10  for  optimizing  loop-free  programs,  we  might  optimize 
programs  which  always  halt,  under  reasonable  cost  functions.   The 
following  transformations  that  operate  on  programs  with  loops  only, 
and  are  equivalent  to  sequences  of  transformations  of  T1-T12,  might  be 
used  by  the  optimizing  process. 

T13  Moving loop-independent statements out of the loop

Let S_i: A ← φB_1...B_r be a statement in a loop such that
B_1,...,B_r are not defined by any statement in the loop. Then

(1) if A is not referenced by any statement in the loop, S_i can be
moved out of the loop either backwards or forwards.

(2) if A is referenced by some statement S_j in the loop, S_i can be
moved backwards out of the loop.

[Figure: flowchart showing S_i: A ← φB_1...B_r moved out of the loop;
not recoverable from the source.]
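
The two conditions of T13 can be checked mechanically. The Python
sketch below is an illustration only: it tests exactly the conditions
stated above and nothing more (for instance, it does not consider
several definitions of A in the same loop). The tuple encoding of
statements and the name `classify` are hypothetical.

```python
def classify(loop):
    """loop: list of (target, operator, operands); tests have target None.
    Returns {index: "either" | "backwards"} for the statements that
    satisfy the conditions of T13."""
    defined = {target for target, _, _ in loop if target is not None}
    movable = {}
    for i, (target, _, operands) in enumerate(loop):
        if target is None:
            continue                      # a test, not an assignment
        if any(b in defined for b in operands):
            continue                      # some B_k is defined in the loop
        referenced = any(target in ops
                         for j, (_, _, ops) in enumerate(loop) if j != i)
        movable[i] = "backwards" if referenced else "either"
    return movable

# A <- phi(B, C) is loop-independent and A is used by D <- psi(A, D).
loop_a = [("A", "phi", ("B", "C")),
          ("D", "psi", ("A", "D")),
          (None, "t", ("D",))]
# Here A is not referenced inside the loop at all.
loop_b = [("A", "phi", ("B",)),
          ("D", "psi", ("D",)),
          (None, "t", ("D",))]

result_a = classify(loop_a)   # statement 0 may only move backwards
result_b = classify(loop_b)   # statement 0 may move in either direction
```
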


PROPOSITION  1

If Π =(13)=> Π', then Π =(1,2,4,12)=> Π'.

PROOF:

[Figure: the loop containing A ← φB_1...B_r and the program obtained
from it by T12; not recoverable from the source.]

By T12 we get a corresponding loop-free program.

(1) If A is not referenced by any statement in the loop, all the
statements except the last are useless and can be eliminated by several
applications of T1. The last statement can be flipped forwards by
applications of T4 so that it will be moved out of the loop.

(2) Whether or not A is referenced by statements in the loop, we
might use T2 several times to remove the redundant definitions of A,
so that only the first definition remains. Then we use T4 to flip
the first statement backwards to move it out of the loop.

The reverse of T12 is used to transfer the loop-free program back to
a loop program.

T14  Merging of identical subgraphs which include loops

[Figure: flowcharts before and after T14; not recoverable from the
source.]

We assume that the loop always halts.


PROPOSITION  2

If Π =(14)=> Π', then Π =(11,12)=> Π'.

PROOF:

T12 transforms Π to a loop-free program, then T11 merges the
identical subgraphs, and the reverse of T12 transforms the loop-free
program back to a loop program.

T15  Elimination of nonexecutable loops

[Figure: flowchart containing a loop and a STOP statement; not
recoverable from the source.]

The loop can never be executed.

PROPOSITION  3

If Π =(15)=> Π', then Π =(12)=> Π'.

PROOF:

The procedure that transforms a program which always halts to a
loop-free program eliminates nonexecutable paths.

T16  Merging of loop-free code with loops

[Figure: two flowcharts related by T16, including the test t_3(L_2);
not recoverable from the source.]

The two programs give the same loop-free program. The program on the
right is more efficient.

PROPOSITION  4

If Π =(16)=> Π', then Π =(12)=> Π'.

PROOF:

T12 transforms Π to a loop-free program, which is transformed to Π'
by the reverse of T12.

The optimizing process uses T13-T16 together with the transformations
T1-T11.

T13 might be used to reduce the program.

T15 is used in the stage where nonexecutable paths are eliminated
from the program.

T14 is used whenever T11 is used to merge subgraphs of the program.
In this stage we can also apply T16 to merge as much loop-free code
with loops as possible.